
POLYPEPTIDES THAT BIND HIV gp120 AND RELATED NUCLEIC ACIDS,
ANTIBODIES, COMPOSITIONS, AND METHODS OF USE

TECHNICAL FIELD OF THE INVENTION 
The present invention relates to polypeptides with 
homology to regions of domains of the human chemokine 
receptors CCR5, CXCR4, and STRL33, as well as domains of
CD4 that bind with human immunodeficiency virus (HIV), in
particular HIV-1 glycoprotein 120 (gp120) envelope protein.
The present invention also relates to nucleic acids 
encoding such polypeptides, antibodies, compositions 
comprising such polypeptides, nucleic acids or antibodies, 
and methods of using the same. 

BACKGROUND OF THE INVENTION 

There are seven transmembrane chemokine receptors that
act as cofactors for HIV infection. The cofactors enable
entry of HIV-1 into CD4+ T cells and macrophages (Premack et
al., Nature Medicine 2: 1174-78 (1996); and Zhang et al.,
Nature 383: 768 (1996)).

The presence of chemokines has an inhibitory effect on 
HIV-1 attachment to, and infection of, susceptible cells. 
Additionally, some mutations in chemokine receptors have 
been shown to result in resistance to HIV-1 infection. For 
example, a 32-nucleotide deletion within the CCR5 gene has
been described in subjects who remained uninfected despite
repeated exposures to HIV-1 (Huang et al., Nature Medicine
2: 1240-43 (1996)).

Evidence also exists for the physical association of a
ternary complex between chemokine receptors, CD4, and HIV-1
gp120 envelope glycoprotein on cell membranes (Lapham et
al., Science 274: 602-05 (1996)). Receptor signaling and
cell activation are probably not required for the
anti-HIV-1 effect of chemokines since a RANTES analog
lacking the first eight amino-terminal amino acids, RANTES
(9-68), lacked chemotactic and leukocyte-activating
properties, but bound to multiple chemokine receptors and
inhibited infection by macrophage-tropic HIV-1
(Arenzana-Seisdedos et al., Nature 383: 400 (1996)).
Cumulatively, the above-described results suggest that the
interaction between gp120, CD4, and at least one chemokine
receptor is obligatory for HIV-1 infection. Accordingly,
reagents that interfere with the binding of gp120 to
chemokine receptors and to CD4 are used in the biological
and medical arts.

However, there presently exists a need for additional
reagents that can compete with one or more proteins of the
gp120-CD4-chemokine receptor complex to assist in basic
biological or viral research, and to assist in medical
intervention in the HIV-1 pandemic. It is an object of the
present invention to provide such reagents. This and other
objects and advantages, including additional inventive
features, will be apparent from the description provided
herein.

BRIEF SUMMARY OF THE INVENTION

The present invention provides a polypeptide that
binds with HIV gp120 under physiological conditions.
Multiple embodiments of the present inventive polypeptide
are provided, and each embodiment possesses a degree of
homology to at least one of the human CCR5, CXCR4 and
STRL33 chemokine receptors, and the human CD4 cell-surface
protein.

In a first embodiment, the present invention provides
a polypeptide comprising the amino acid sequence YDIXYYXXE
(SEQ ID NO: 1), wherein X is any synthetic or naturally
occurring amino acid residue, and the polypeptide comprises
less than about 100 contiguous amino acids that are
identical to, or, in the alternative, substantially
identical to, the amino acid sequence of the human CCR5
chemokine receptor. A preferred polypeptide of this first
embodiment comprises the amino acid sequence YDIN*YYT*S*E
(SEQ ID NO: 3). A more preferred polypeptide of this first
embodiment comprises the amino acid sequence YDINYYTSE (SEQ
ID NO: 3), wherein each letter is the standard one-letter
abbreviation for an amino acid residue (for example,
N denotes asparaginyl, T denotes threoninyl, and S denotes
serinyl). The polypeptide of the first embodiment can
comprise the amino acid sequence
M*D*YQ*V*S*SP*IYDIN*YYT*S*E (SEQ ID NO: 5). Preferably,
the polypeptide comprises the amino acid sequence
MDYQVSSPIYDINYYTSE (SEQ ID NO: 5).

In a second embodiment, the present invention provides
a polypeptide comprising the amino acid sequence
XEXIXIYXXXNYXXX (SEQ ID NO: 6), wherein X is any synthetic
or naturally occurring amino acid and wherein said
polypeptide comprises less than about 100 contiguous amino
acids that are identical to or substantially identical to
the amino acid sequence of the human CXCR4 chemokine
receptor. The polypeptide can consist essentially of, or
consist of, the sequence EXIXIYXXXNY (SEQ ID NO: 7).
Preferably, the polypeptide comprises the sequence
M*EG*IS*IYT*S*D*NYT*E*E*. Preferably,
M*EG*IS*IYT*S*D*NYT*E*E* is M*EGISIYTSDNYT*E*E*.

In a third embodiment, the present invention provides
a polypeptide comprising the amino acid sequence EHQAFLQFS
(SEQ ID NO: 10), wherein said polypeptide comprises less
than about 100 contiguous amino acids that are identical to
or substantially identical to the amino acid sequence of
the human STRL33 chemokine receptor. The polypeptide can
consist essentially of, or consist of, the sequence
EHQAFLQFS (SEQ ID NO: 10).

In a fourth embodiment, the present invention provides
a polypeptide comprising at least a portion of an amino
acid sequence selected from the group consisting of
LPPLYSLVFIFGFVGNML (SEQ ID NO: 11), QWDFGNTMCQLLTGLYFIGFFS
(SEQ ID NO: 12), SQYQFWKNFQTLKIVILG (SEQ ID NO: 13),
APYNIVLLLNTFQEFFGLNNCS (SEQ ID NO: 14), and
YAFVGEKFRNYLLVFFQK (SEQ ID NO: 15), wherein said
polypeptide comprises less than about 100 contiguous amino
acids that are identical to or substantially identical to
the amino acid sequence of the human CCR5 chemokine
receptor.

In a fifth embodiment, the present invention provides
a polypeptide comprising at least a portion of an amino
acid sequence selected from the group consisting of
LLLTIPDFIFANVSEADD (SEQ ID NO: 16), WFQFQHIMVGLILPGIV (SEQ
ID NO: 17), and IDSFILLEIIKQGCEFEN (SEQ ID NO: 18), wherein
said polypeptide comprises less than about 100 contiguous
amino acids that are identical to or substantially
identical to the amino acid sequence of the human CXCR4
chemokine receptor.

In a sixth embodiment, the present invention provides
a polypeptide comprising at least a portion of an amino
acid sequence selected from the group consisting of
LVISIFYHKLQSLTDVFL (SEQ ID NO: 19), PFWAYAGIHEWVFGQVMC (SEQ
ID NO: 20), EAISTWLATQMTLGFFL (SEQ ID NO: 21),
LTMIVCYSVIIKTLLHAG (SEQ ID NO: 22), MAVFLLTQMPFNLMKFIRSTHW
(SEQ ID NO: 23), HWEYYAMTSFHYTIMVTE (SEQ ID NO: 24),
ACLNPVLYAFVSLKFRKN (SEQ ID NO: 25) and SKTFSASHNVEATSMFQL
(SEQ ID NO: 26), wherein said polypeptide comprises less
than about 100 contiguous amino acids that are identical to
or substantially identical to the amino acid sequence of
the human STRL33 chemokine receptor.

In a seventh embodiment, the present invention
provides a polypeptide comprising at least a portion of an
amino acid sequence selected from the group consisting of
DTYICEVED (SEQ ID NO: 27), EEVQLLVFGLTANSD (SEQ ID NO: 28),
THLLQGQSLTLTLES (SEQ ID NO: 29), and GEQVEFSFPLAFTVE (SEQ
ID NO: 30), wherein said polypeptide comprises less than
about 100 contiguous amino acids that are identical to or
substantially identical to the amino acid sequence of the
human CD4 cell-surface protein.

In the fourth to seventh embodiments, any selected
portion of the polypeptide can comprise from 1 to about 6
conservative amino acid substitutions. In an alternative,
the polypeptide can be partially defined by an absence of a
polypeptide sequence, outside the region of the portion
selected from the foregoing sequences, that has five, or
ten, contiguous amino acid residues whose sequence is
identical to or substantially identical to a sequence of
the protein to which the polypeptide has homology (i.e.,
CCR5, CXCR4, STRL33, or CD4). In yet another alternative,
the polypeptide can lack a sequence of five or ten
contiguous amino acids which are identical to or
substantially identical to the sequence of the protein with
which the sequence has homology except that one or more
conservatively or neutrally substituted amino acids replace
part of the sequence of the protein to which the
polypeptide has homology. Additionally, any embodiment of
the present inventive polypeptide can also comprise a
pharmaceutically acceptable substituent.

Any embodiment of the present inventive polypeptide
can be incorporated into a composition, which further
comprises a carrier. Any suitable embodiment of the
present inventive polypeptide can be encoded by a nucleic
acid that can be expressed in a cell. In this regard, the
present invention further provides a vector comprising such
a nucleic acid. The nucleic acids and vectors also can be
incorporated into a composition comprising a carrier.

Additionally, the present invention provides a method
of making an antibody to a polypeptide of the present
invention. The present invention also provides a method of
prophylactically or therapeutically treating an HIV
infection in a mammal.

Additionally, the present invention provides an
anti-idiotypic antibody comprising an internal image of a
portion of gp120, as well as a method of selecting such an
antibody.

The present invention also provides a method of making
an antibody to a portion of the gp120 protein that binds
with a portion of CCR5, CXCR4, STRL33, or CD4, as well as
the immunizing compound used to make the antibody, and the
antibody itself. In another embodiment of the present
invention, a method of removing HIV-1 from a bodily fluid
is provided.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 depicts a listing of synthetic amino acids
available (from Bachem, King of Prussia, PA) for
incorporation into polypeptides of the present invention.

DETAILED DESCRIPTION OF THE INVENTION 
The present invention provides a polypeptide that
binds with gp120 of HIV, in particular HIV-1, more
particularly HIV-Il^, under physiological conditions. The
polypeptide has a number of uses including, but not limited
to, the use of the polypeptide to elucidate the mechanism
by which HIV, such as HIV-1, attaches to and/or infects a
particular cell, to induce an immune response in a mammal,
in particular a human, to HIV, in particular HIV-1, and to
inhibit the replication of HIV, in particular HIV-1, in an
infected mammal, in particular a human.

Multiple embodiments of the present inventive
polypeptide are provided. Each embodiment of the
polypeptide has a degree of homology to at least one of the
human CCR5, CXCR4 and STRL33 chemokine receptors, or the
human CD4 cell-surface protein. In each embodiment
provided herein, a letter indicates the standard amino acid
designated by that letter, and a letter followed directly
by an asterisk (*) preferably represents the amino acid
represented by the letter (e.g., N represents asparaginyl
and T represents threoninyl), or a synthetic or naturally
occurring conservative or neutral substitution therefor.
Additionally, in accordance with convention, all amino acid
sequences provided herein are given either from left to
right, or top to bottom, such that the first amino acid is
amino-terminal and the last is carboxyl-terminal. The
synthesis of polypeptides, either chemically or
biologically, is within the skill in the art.
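
By way of illustration only, the asterisk notation described
above can be read mechanically. The following Python sketch is
not part of the disclosed subject matter; the function name
`parse_starred` and its return format are assumptions made for
this example. It splits a sequence written in this notation
into residue positions and records which positions are marked
as open to a conservative or neutral substitution.

```python
def parse_starred(sequence: str) -> list[tuple[str, bool]]:
    """Split a starred sequence into (residue, substitutable) pairs.

    A letter is a standard one-letter residue code. A letter followed
    by '*' marks a position where a conservative or neutral
    substitution is contemplated. Whitespace is ignored.
    """
    positions: list[tuple[str, bool]] = []
    for ch in sequence:
        if ch == "*":
            # An asterisk must follow a letter that is not yet starred.
            if not positions or positions[-1][1]:
                raise ValueError("'*' must follow a residue letter")
            positions[-1] = (positions[-1][0], True)
        elif ch.isalpha():
            positions.append((ch.upper(), False))
        elif not ch.isspace():
            raise ValueError(f"unexpected character: {ch!r}")
    return positions


# Example using the preferred first-embodiment sequence from the text.
parsed = parse_starred("YDIN*YYT*S*E")
backbone = "".join(residue for residue, _ in parsed)
starred = [i for i, (_, flag) in enumerate(parsed) if flag]
```

Here `backbone` holds the unstarred letters and `starred` holds
the zero-based indices of the positions marked with an asterisk.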

It is within the skill of the ordinary artisan to
select synthetic and naturally occurring amino acids that
make conservative or neutral substitutions for any
particular naturally occurring amino acids. The skilled
artisan desirably will consider the context in which any
particular amino acid substitution is made, in addition to
considering the hydrophobicity or polarity of the side
chain, the general size of the side chain, and the pK value
of side chains with acidic or basic character under
physiological conditions. For example, lysine, arginine,
and histidine are often suitably substituted for each
other, and more often arginine and lysine. As is known in
the art, this is because all three amino acids have basic
side chains, whereas the pK values for the side chains of
lysine and arginine are much closer to each other (about 10
and 12) than to histidine (about 6). Similarly, glycine,
alanine, valine, leucine, and isoleucine are often suitably
substituted for each other, with the proviso that glycine
is frequently not suitably substituted for the other
members of the group. This is because each of these amino
acids is relatively hydrophobic when incorporated into a
polypeptide, but glycine's lack of a side chain on the
α-carbon allows the phi and psi angles of rotation (around
the α-carbon) so much conformational freedom that glycinyl
residues can trigger changes in conformation or secondary
structure that do not often occur when the other amino
acids are substituted for each other. Other groups of amino
acids frequently suitably substituted for each other
include, but are not limited to, the group consisting of
glutamic and aspartic acids; the group consisting of
phenylalanine, tyrosine and tryptophan; and the group
consisting of serine, threonine and, optionally, tyrosine.
Additionally, the skilled artisan can readily group
synthetic amino acids with naturally occurring amino acids.
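
For illustration, the substitution groups named above can be
written as a small lookup table. The following Python sketch is
only a rough screen under stated assumptions: the group labels
and function names are hypothetical, only the naturally
occurring residues named in the text are included, and the
context-dependent judgment described above (for example, the
caution regarding glycine) is not modeled.

```python
# Residue groups taken from the paragraph above; the labels are
# assumptions made for this sketch, not terms used in the text.
SUBSTITUTION_GROUPS = {
    "basic": set("KRH"),       # lysine, arginine, histidine
    "aliphatic": set("GAVLI"), # glycine is often a poor substitute
    "acidic": set("DE"),       # aspartic and glutamic acids
    "aromatic": set("FYW"),    # phenylalanine, tyrosine, tryptophan
    "hydroxyl": set("STY"),    # serine, threonine, optionally tyrosine
}


def shared_groups(a: str, b: str) -> list[str]:
    """Return the sorted labels of the groups containing both residues."""
    return sorted(
        label
        for label, members in SUBSTITUTION_GROUPS.items()
        if a in members and b in members
    )


def is_candidate_conservative(a: str, b: str) -> bool:
    """True when two different residues share at least one group."""
    return a != b and bool(shared_groups(a, b))
```

A call such as `is_candidate_conservative("K", "R")` flags a
pair for further consideration; it does not establish that the
substitution is suitable in any particular polypeptide.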

In the context of the present invention, a polypeptide
is "substantially identical" to another polypeptide if it
comprises at least about 80% identical amino acids.
Desirably, at least about 50% of the non-identical amino
acids are conservative or neutral substitutions. Also,
desirably, the polypeptides differ in length (i.e., due to
deletion mutations) by no more than about 10%.
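
The three criteria above can be sketched numerically. The
following Python example is a minimal sketch, not a definitive
test: it assumes the two sequences have already been aligned to
equal length with "-" marking gaps, it computes identity over
the gap-free columns only, and it treats "about" as an exact
threshold. These choices, the function names, and the optional
set of conservative pairs are all assumptions made for the
example; the text does not prescribe an alignment method.

```python
def identity_report(aligned_a, aligned_b, conservative_pairs=frozenset()):
    """Summarize identity, conservative mismatches, and length difference.

    Inputs must be pre-aligned strings of equal length using '-' for gaps.
    conservative_pairs is a set of frozensets such as frozenset("DE").
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("inputs must be pre-aligned to equal length")
    # Compare only columns where both sequences have a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    identical = sum(x == y for x, y in columns)
    mismatches = [(x, y) for x, y in columns if x != y]
    conservative = sum(frozenset((x, y)) in conservative_pairs
                       for x, y in mismatches)
    len_a = len(aligned_a.replace("-", ""))
    len_b = len(aligned_b.replace("-", ""))
    longest = max(len_a, len_b)
    return {
        "identity": identical / len(columns) if columns else 0.0,
        "conservative_fraction":
            conservative / len(mismatches) if mismatches else 1.0,
        "length_difference": abs(len_a - len_b) / longest if longest else 0.0,
    }


def is_substantially_identical(aligned_a, aligned_b,
                               min_identity=0.80,
                               max_length_difference=0.10):
    """Apply the 80% identity and 10% length criteria from the text."""
    report = identity_report(aligned_a, aligned_b)
    return (report["identity"] >= min_identity
            and report["length_difference"] <= max_length_difference)
```

The "conservative_fraction" value corresponds to the desirable,
rather than required, 50% criterion and is therefore reported
but not enforced in this sketch.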

In a first embodiment, the present invention provides
a polypeptide comprising the amino acid sequence YDIXYYXXE
(SEQ ID NO: 1), wherein X is any synthetic or naturally
occurring amino acid residue, and the polypeptide comprises
less than about 100 contiguous amino acids, preferably less
than about 50 amino acids, more preferably less than about
25 amino acids, and yet more preferably less than about 13
amino acids that are identical to, or, in the alternative,
substantially identical to, the amino acid sequence of the
human CCR5 chemokine receptor.
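
As an illustration of how a motif containing X positions can be
located in a candidate sequence, the following Python sketch
translates the motif into a regular expression. It is a sketch
under stated assumptions: the helper names are hypothetical,
and sequences are assumed to be written in one-letter codes, so
synthetic residues that lack a one-letter code are not
represented.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a motif in which X means 'any residue' into a regex."""
    return re.compile(
        "".join("." if ch == "X" else re.escape(ch) for ch in motif)
    )


def find_motif(pattern: re.Pattern, sequence: str) -> list[int]:
    """Return zero-based start positions of non-overlapping matches."""
    return [match.start() for match in pattern.finditer(sequence)]


# Motif of the first embodiment, searched in the 18-residue sequence
# given in the text for the first embodiment.
ccr5_motif = motif_to_regex("YDIXYYXXE")
hits = find_motif(ccr5_motif, "MDYQVSSPIYDINYYTSE")
```

In this example the motif is found once, beginning at the tenth
residue (zero-based index 9) of the searched sequence.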

Preferably, the polypeptide of the first embodiment
comprises YDIXYYXXE (SEQ ID NO: 1), wherein the amino
moiety of the amino-terminal tyrosinyl residue is not bound
to another amino acid residue via a peptidic bond, and the
carboxyl moiety of the glutamyl residue is not bound to
another amino acid residue via a peptidic bond. However,
the polypeptide can consist essentially of YDIXYYXXE (SEQ
ID NO: 1) and, optionally, can be modified by one or more
pharmaceutically acceptable substituents, such as, for
example, t-boc or a saccharide.

More particularly, the polypeptide comprises the amino
acid sequence YDIN*YYT*S*E (SEQ ID NO: 3). Preferably, N*
is asparaginyl, T* is threoninyl, and S* is serinyl.

The polypeptide of the first embodiment can comprise a
dodecapeptide selected from the amino acid sequence
M*D*YQ*V*S*SP*IYDIN*YYT*S*E (SEQ ID NO: 5). More
preferably, the polypeptide of the first embodiment
comprises the amino acid sequence MDYQVSSPIYDINYYTSE (SEQ
ID NO: 5).

In a second embodiment, the present invention provides
a polypeptide comprising the amino acid sequence
XEXIXIYXXXNYXXX (SEQ ID NO: 6), wherein X is any synthetic
or naturally occurring amino acid, and the polypeptide
comprises less than about 100 contiguous amino acids,
preferably less than about 50 amino acids, and more
preferably less than about 25 amino acids, that are
identical to or substantially identical to the amino acid
sequence of the human CXCR4 chemokine receptor.
Optionally, the polypeptide consists essentially of, or
consists of, the sequence EXIXIYXXXNY (SEQ ID NO: 7).

In a preferred polypeptide of this second embodiment,
the polypeptide comprises the amino acid sequence
M*EG*IS*IYT*S*D*NYT*E*E*. Preferably,
M*EG*IS*IYT*S*D*NYT*E*E* is M*EGISIYTSDNYT*E*E*.

In a third embodiment, the present invention provides
a polypeptide comprising the amino acid sequence EHQAFLQFS,
wherein the polypeptide comprises less than about 100
contiguous amino acid residues, preferably less than about
50 contiguous amino acid residues, more preferably less
than about 25 contiguous amino acid residues, that are
identical to or substantially identical to the amino acid
sequence of the human STRL33 chemokine receptor. The
polypeptide can consist essentially of, or consist of, the
sequence EHQAFLQFS.

The first three embodiments of the present invention
provide, among other things, polypeptides having
substantial identity or identity to the amino-terminal
regions of the chemokine receptors CCR5, CXCR4, and STRL33.
These first three embodiments form a first group of
embodiments of the present invention. The present
invention also provides, in a second group of embodiments,
polypeptides having substantial identity or identity to an
internal region of the human chemokine receptors CCR5,
CXCR4, and STRL33, as well as to the leukocyte cell-surface
protein CD4.

This second group of embodiments provides a
polypeptide that binds with HIV gp120 under physiological
conditions and comprises at least a portion of or all of an
amino acid sequence selected from the group consisting of
LPPLYSLVFIFGFVGNML (SEQ ID NO: 11), QWDFGNTMCQLLTGLYFIGFFS
(SEQ ID NO: 12), SQYQFWKNFQTLKIVILG (SEQ ID NO: 13),
APYNIVLLLNTFQEFFGLNNCS (SEQ ID NO: 14), and
YAFVGEKFRNYLLVFFQK (SEQ ID NO: 15), wherein the polypeptide
comprises less than about 100 amino acids that are
identical to or substantially identical to the amino acid
sequence of the human CCR5 chemokine receptor; or selected
from the group consisting of LLLTIPDFIFANVSEADD (SEQ ID NO:
16) (165-182), WFQFQHIMVGLILPGIV (SEQ ID NO: 17) (197-214),
and IDSFILLEIIKQGCEFEN (SEQ ID NO: 18) (261-278),
wherein the polypeptide comprises less than about 100 amino
acids that are identical to or substantially identical to
the amino acid sequence of the human CXCR4 chemokine
receptor; or selected from the group consisting of
LVISIFYHKLQSLTDVFL (SEQ ID NO: 19) (53-70),
PFWAYAGIHEWVFGQVMC (SEQ ID NO: 20) (85-102),
EAISTWLATQMTLGFFL (SEQ ID NO: 21) (185-202),
LTMIVCYSVIIKTLLHAG (SEQ ID NO: 22) (205-222),
MAVFLLTQMPFNLMKFIRSTHW (SEQ ID NO: 23) (237-258),
HWEYYAMTSFHYTIMVTE (SEQ ID NO: 24) (257-274),
ACLNPVLYAFVSLKFRKN (SEQ ID NO: 25) (281-298) and
SKTFSASHNVEATSMFQL (SEQ ID NO: 26) (325-342), wherein the
polypeptide comprises less than about 100 amino acids that
are identical to or substantially identical to the amino
acid sequence of the human STRL33 chemokine receptor; or
selected from the group consisting of DTYICEVED (SEQ
ID NO: 27), EEVQLLVFGLTANSD (SEQ ID NO: 28),
THLLQGQSLTLTLES (SEQ ID NO: 29), and GEQVEFSFPLAFTVE (SEQ
ID NO: 30), wherein the polypeptide binds with HIV gp120
under physiological conditions and comprises less than
about 100 amino acids that are identical to or
substantially identical to the amino acid sequence of the
human CD4 cell-surface protein. Optionally, the recited
amino acid sequences can comprise 1 to about 6 conservative
or neutral amino acid substitutions.

The polypeptides of this second group of embodiments
preferably comprise less than about 50 amino acid residues,
and more preferably less than about 25 amino acid residues,
and yet more preferably no additional amino acid residues,
that are identical to a protein that naturally has the
recited amino acid sequence. The polypeptide can be
alternatively characterized by an absence of a region,
outside the above-recited amino acid sequences, that has
about five, or about ten, contiguous amino acid residues
whose sequence consists of residues identical to, or
conservatively substituted for, those of an amino acid
sequence of the protein to which the polypeptide has
homology.

Any embodiment of the present inventive polypeptide
can also comprise a pharmaceutically acceptable
substituent, attachment of which is within the skill in the
art. The pharmaceutical acceptability of substituents
is understood by those skilled in the art. For example, a
pharmaceutically acceptable substituent can be a
biopolymer, such as a polypeptide, an RNA, a DNA, or a
polysaccharide. Suitable polypeptides comprise fusion
proteins, an antibody or fragment thereof, a cell adhesion
molecule or a fragment thereof, or a peptide hormone.
Suitable polysaccharides comprise polyglucose moieties,
such as starch and their derivatives, such as heparin. The
pharmaceutically acceptable substituent also can be any
suitable lipid or lipid-containing moiety, such as a lipid
of a liposome or a vesicle, or even a lipophilic moiety,
such as a prostaglandin, a steroid hormone, or a derivative
thereof. Additionally, the pharmaceutically acceptable
substituent can be a nucleotide or nucleoside, such as
nicotinamide adenine dinucleotide or thymine, an amino acid
residue, a saccharide or disaccharide, or the residue of
another biomolecule naturally occurring in a cell, such as
inositol, a vitamin, such as vitamin C, thiamine, or
nicotinic acid. Synthetic organic moieties also can be
pharmaceutically acceptable substituents, such as t-butyl
carbonyl, an acetyl moiety, quinine, polystyrene and other
biologically acceptable polymers. Optionally, a
pharmaceutically acceptable substituent can be selected
from the group consisting of a C1-C18 alkyl, a C2-C18
alkenyl, a C2-C18 alkynyl, a C6-C18 aryl, a C7-C18 alkaryl, a
C7-C18 aralkyl, and a C3-C18 cycloalkyl, wherein any of the
foregoing moieties that are cyclic comprise from 0 to 2
atoms per carbocyclic ring, which can be the same or
different, and are selected from the group consisting of
nitrogen, oxygen, and sulfur.

Any of the substituents from this group can be
substituted by one to six substituent moieties, which can
be the same or different, selected from the group
consisting of an amino moiety, a carbamate moiety, a
carbonate moiety, hydroxyl, a phosphamate moiety, a
phosphate moiety, a phosphonate moiety, a pyrophosphate
moiety, a triphosphate moiety, a sulfamate moiety, a
sulfate moiety, a sulfonate moiety, a C1-C8 monoalkylamine
moiety, a C1-C8 dialkylamine moiety, and a C1-C8
trialkylamine moiety.

Any embodiment of the present inventive polypeptide
can be encoded by a nucleic acid and can be expressed in a
cell. The skilled artisan will recognize that the encoded
polypeptide as well as any pharmaceutically acceptable
substituent to be incorporated into the polypeptide, e.g.,
a formyl or acetyl substituent on an amino-terminal
methionine or a saccharide, will preferably be produced by
a cell that can express the polypeptide of the present
invention. Accordingly, the amino acids incorporated into
the polypeptide encoded by the nucleic acid are preferably
naturally occurring.

A nucleic acid as described above can be cloned into
any suitable vector and can be used to transduce,
transform, or transfect any suitable host. The selection
of vectors and methods to construct them are commonly known
to persons of ordinary skill in the art and are described
in general technical references (see, in general,
"Recombinant DNA Part D," Methods in Enzymology, Vol. 153,
Wu and Grossman, eds., Academic Press (1987)). Desirably,
the vector comprises regulatory sequences, such as
transcription and translation initiation and termination
codons, which are specific to the type of host (e.g.,
bacterium, fungus, plant, or animal) into which the vector
is to be inserted, as appropriate and taking into
consideration whether the vector is DNA or RNA.
Preferably, the vector comprises regulatory sequences that
are specific to the genus of the host. Most preferably,
the vector comprises regulatory sequences that are specific
to the species of the host and is optimized for the
expression of an above-described polypeptide.

Constructs of vectors, which are circular or linear,
can be prepared to contain an entire nucleic acid sequence
as described above or a portion thereof ligated to a
replication system that is functional in a prokaryotic or
eukaryotic host cell. Replication systems can be derived
from ColE1, 2 mμ plasmid, λ, SV40, bovine papilloma virus,
and the like.

Suitable vectors include those designed for
propagation and expansion, or for expression, or both. A
preferred cloning vector is selected from the group
consisting of the pUC series, the pBluescript series
(Stratagene, La Jolla, CA), the pET series (Novagen,
Madison, WI), the pGEX series (Pharmacia Biotech, Uppsala,
Sweden), and the pEX series (Clontech, Palo Alto, CA).
Examples of animal expression vectors include pEUK-C1, pMAM
and pMAMneo (Clontech, Palo Alto, CA).

An expression vector can comprise a native or
nonnative promoter operably linked to a nucleic acid
molecule encoding an above-described polypeptide. The
selection of promoters, e.g., strong, weak, inducible,
tissue-specific and developmental-specific, is within the
skill in the art. Similarly, the combining of a nucleic
acid molecule as described above with a promoter is also
within the skill in the art.

The skilled artisan will also recognize that the
polypeptide has the ability to bind the gp120 protein,
which is most often found outside of cells. Accordingly,
the present inventive nucleic acid advantageously can
comprise a nucleic acid sequence that encodes a signal
sequence such that the signal sequence is translated as a
fusion protein with the present inventive polypeptide
to form a signal sequence-polypeptide fusion. The signal
sequence can cause secretion of the entire polypeptide,
including the signal sequence (which is a pharmaceutically
acceptable substituent), or can be cleaved from the
polypeptide (i.e., the polypeptide of the compound) prior
to, or during, secretion so that at least the present
inventive polypeptide is secreted out of a cell in which
the nucleic acid is expressed.

Alternatively, the nucleic acid comprises or encodes
an antisense nucleic acid molecule or a ribozyme that is
specific for a specified amino acid sequence of an
above-described polypeptide. A nucleic acid sequence
introduced in antisense suppression generally is
substantially identical to at least a portion of the
endogenous gene or genes to be repressed, but need not be
identical. Thus, the vectors can be designed such that the
inhibitory effect applies to other proteins within a family
of genes exhibiting homology or substantial homology to the
target gene. The introduced sequence also need not be
full-length relative to either the primary transcription
product or the fully processed mRNA. Generally, higher
homology can be used to compensate for the use of a shorter
sequence. Furthermore, the introduced sequence need not
have the same intron or exon pattern, and homology of
non-coding segments will be equally effective.
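
For illustration, the relationship between a coding sequence
and a fully complementary antisense RNA can be sketched as
follows. This Python example is a minimal sketch: the function
names are hypothetical, the input is assumed to be the coding
(sense) DNA strand written 5' to 3' using only A, C, G, and T,
and the partial or imperfect homology discussed above is not
modeled.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA strand written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(coding_dna: str) -> str:
    """Return the RNA (5' to 3') complementary to the mRNA that would be
    transcribed from the given coding strand."""
    return reverse_complement(coding_dna).upper().replace("T", "U")


# Hypothetical six-base fragment used only to exercise the functions.
example = antisense_rna("ATGGAC")
```

The six-base input is an arbitrary fragment chosen for the
example and does not correspond to any sequence recited herein.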

Ribozymes also have been reported to have use as a
means to inhibit expression of endogenous genes. It is
possible to design ribozymes that specifically pair with
virtually any target RNA and cleave the phosphodiester
backbone at a specific location, thereby functionally
inactivating the target RNA. In carrying out this
cleavage, the ribozyme is not itself altered and is, thus,
capable of recycling and cleaving other molecules, making
it a true enzyme. The inclusion of ribozyme sequences
within antisense RNAs confers RNA-cleaving activity upon
them, thereby increasing the activity of the constructs.
The design and use of target RNA-specific ribozymes is
described in Haseloff et al., Nature 334: 585-591 (1988).



Further provided by the present invention is a
composition comprising an above-described polypeptide or
nucleic acid and a carrier therefor. Another composition
provided by the present invention is a composition
comprising an antibody to an above-described polypeptide or
an anti-antibody to an above-described polypeptide.

Any embodiment of the present invention, including the
present inventive polypeptide, nucleic acid, antibody, and
anti-antibody, can be incorporated into a composition
comprising a carrier. The carrier can serve any function.
For example, the carrier can increase the solubility of the
present inventive polypeptide, nucleic acid or antibody in
aqueous solutions. Additionally, the carrier can protect
the present inventive polypeptide, nucleic acid or antibody
from environmental insults, such as dehydration, oxidation,
and photolysis. Moreover, the carrier can serve as an
adjuvant, or as a timed-release control means in a
biological system.

Antibodies can be generated in accordance with methods
known in the art. See, for example, Benjamin, In
Immunology: a short course, Wiley-Liss, NY, 1996, pp.
436-437; Kuby, In Immunology, 3rd ed., Freeman, NY, 1997,
pp. 455-456; Greenspan et al., FASEB J. 7: 437-443 (1993);
and Poskitt, Vaccine 9: 792-796 (1991). Anti-antibodies
(i.e., anti-idiotypic antibodies) also can be generated in
accordance with methods known in the art (see, for example,
Benjamin, In Immunology: a short course, Wiley-Liss, NY,
1996, pp. 436-437; Kuby, In Immunology, 3rd ed., Freeman,
NY, 1997, pp. 455-456; Greenspan et al., FASEB J. 7:
437-443 (1993); Poskitt, Vaccine 9: 792-796 (1991); and
Madiyalakan et al., Hybridoma 14: 199-203 (1995)
("Anti-idiotype induction therapy")). Such antibodies can
be obtained and employed either in solution-phase or
coupled to a desired solid-phase matrix. Having in hand
such antibodies, one skilled in the art will further
appreciate that such antibodies, using well-established
procedures (e.g., as described by Harlow and Lane (1988,
supra)), are useful in the detection, quantification, or
purification of gp120 or HIV, particularly HIV-1,
conjugates of each, and host cells transformed to produce a
gp120 receptor or a derivative thereof. Such antibodies
are also useful in a method of prevention or treatment of a
viral infection and in a method of inducing an immune
response to HIV as provided herein.

In view of the above, an above-described polypeptide
can be administered to an animal. The animal generates
anti-polypeptide antibodies. Among the anti-polypeptide
antibodies generated or induced in the animal are
antibodies that have an internal image of gp120. In
accordance with well-known methods, polyclonal or
monoclonal antibodies can be obtained, isolated and
selected. Selection of an anti-polypeptide antibody that
has an internal image of gp120 can be based upon
competition between the anti-polypeptide antibody and gp120
for binding to an above-described polypeptide, or upon the
ability of the anti-polypeptide antibody to bind to a free
polypeptide as opposed to a polypeptide bound to gp120.
Such an anti-antibody can be administered to an animal to
prevent or treat an HIV infection in accordance with
methods provided herein.

Although nonhuman anti-idiotypic antibodies, such as
an anti-polypeptide antibody that has an internal image of
gp120 and, therefore, is anti-idiotypic to gp120, are
useful for prophylaxis in humans, their favorable
properties might, in certain instances, be further
enhanced and/or their adverse properties further
diminished, through "humanization" strategies, such as
those recently reviewed by Vaughan, Nature Biotech., 16,
535-539, 1998.




Prior to administration to an animal, such as a
mammal, in particular a human, an above-described
polypeptide, nucleic acid, antibody or anti-antibody can be
formulated into various compositions by combination with
appropriate carriers, in particular, pharmaceutically
acceptable carriers or diluents, and can be formulated to
be appropriate for either human or veterinary applications.

The present invention also provides a method of making
an antibody. The method comprises administering an
immunogenic amount of an above-described polypeptide or
nucleic acid to an animal, such as a mammal, in particular
a human. The quantity of a polypeptide or
nucleic acid that is immunogenic will depend in part on the
degree of similarity to a protein or other molecule of the
inoculated animal, the route of administration of the
polypeptide or nucleic acid, and the size of the
polypeptide administered or encoded by the administered
nucleic acid. If necessary, the polypeptide or nucleic
acid can be mixed with or ligated to a substance (or an
adjuvant) that enhances its immunogenicity. Such
calculations and procedures are within the skill of the
ordinary artisan. Additionally, the present inventive
method preferably can be used to induce an immune response
against HIV, particularly HIV-1, in a mammal, particularly
a human.

In view of the above, the present invention further
provides a method of prophylactically or therapeutically
treating an HIV infection in a mammal, particularly a
human, in need thereof. The method comprises administering
to the mammal an HIV replication-inhibiting effective
amount of an above-described polypeptide, nucleic acid, or
an anti-antibody to an above-described polypeptide or a
nucleic acid encoding such a polypeptide.

The present invention also provides a method of
prophylactically or therapeutically treating HIV infection
in a mammal. The method comprises administering to the
mammal an effective amount of an above-described
polypeptide or nucleic acid. Prior to administration to an
animal, such as a mammal, in particular a human, an
above-described polypeptide or nucleic acid can be formulated
into various compositions by combination with appropriate
carriers, in particular, pharmaceutically acceptable
carriers or diluents, and can be formulated to be
appropriate for either human or veterinary applications.

Thus, a composition for use in the method of the
present invention can comprise one or more of the
polypeptides, nucleic acids, antibodies or anti-antibodies
described herein, preferably in combination with a
pharmaceutically acceptable carrier. Pharmaceutically
acceptable carriers are well-known to those skilled in the
art, as are suitable methods of administration. The choice
of carrier will be determined, in part, by whether a
polypeptide or a nucleic acid is to be administered, as
well as by the particular method used to administer the
composition. Optionally, the carrier can be selected to
increase the solubility of the composition or mixture,
e.g., a liposome or polysaccharide. One skilled in the art
will also appreciate that various routes of administering a
composition are available, and, although more than one
route can be used for administration, a particular route
can provide a more immediate and more effective reaction
than another route. Accordingly, there are a wide variety
of suitable formulations of compositions that can be used
in the present inventive methods.

A composition in accordance with the present
invention, alone or in further combination with one or more
other active agents, can be made into a formulation
suitable for parenteral administration, preferably
intraperitoneal administration. Such a formulation can
include aqueous and nonaqueous, isotonic sterile injection
solutions, which can contain antioxidants, buffers,
bacteriostats, and solutes that render the formulation
isotonic with the blood of the intended recipient, and
aqueous and nonaqueous sterile suspensions that can include
suspending agents, solubilizers, thickening agents,
stabilizers, and preservatives. The formulations can be
presented in unit dose or multi-dose sealed containers,
such as ampules and vials, and can be stored in a
freeze-dried (lyophilized) condition requiring only the addition
of the sterile liquid carrier, for example, water for
injections, immediately prior to use. Extemporaneous
injection solutions and suspensions can be prepared from
sterile powders, granules, and tablets, as described
herein.

A formulation suitable for oral administration can
consist of liquid solutions, such as an effective amount of
the compound dissolved in diluents, such as water, saline,
or fruit juice; capsules, sachets or tablets, each
containing a predetermined amount of the active ingredient,
as solids or granules; solutions or suspensions in an
aqueous liquid; and oil-in-water emulsions or water-in-oil
emulsions. Tablet forms can include one or more of
lactose, mannitol, corn starch, potato starch,
microcrystalline cellulose, acacia, gelatin, colloidal
silicon dioxide, croscarmellose sodium, talc, magnesium
stearate, stearic acid, and other excipients, colorants,
diluents, buffering agents, moistening agents,
preservatives, flavoring agents, and pharmacologically
compatible carriers.

Similarly, a formulation suitable for oral
administration can include lozenge forms, which can
comprise the active ingredient in a flavor, usually sucrose
and acacia or tragacanth; pastilles comprising the active
ingredient in an inert base, such as gelatin and glycerin,
or sucrose and acacia; and mouthwashes comprising the
active ingredient in a suitable liquid carrier; as well as
creams, emulsions, gels, and the like containing, in
addition to the active ingredient, such carriers as are
known in the art.

An aerosol formulation suitable for administration via
inhalation also can be made. The aerosol formulation can
be placed into a pressurized acceptable propellant, such as
dichlorodifluoromethane, propane, nitrogen, and the like.

A formulation suitable for topical application can be
in the form of creams, ointments, or lotions.

A formulation for rectal administration can be
presented as a suppository with a suitable base comprising,
for example, cocoa butter or a salicylate. A formulation
suitable for vaginal administration can be presented as a
pessary, tampon, cream, gel, paste, foam, or spray formula
containing, in addition to the active ingredient, such
carriers as are known in the art to be appropriate.

Important general considerations for design of
delivery systems and compositions, and for routes of
administration, for polypeptide drugs also apply (Eppstein,
CRC Crit. Rev. Therapeutic Drug Carrier Systems 5, 99-139,
1988; Siddiqui et al., CRC Crit. Rev. Therapeutic Drug
Carrier Systems 3, 195-208, 1987; Banga et al., Int. J.
Pharmaceutics 48, 15-50, 1988; Sanders, Eur. J. Drug Metab.
Pharmacokinetics 15, 95-102, 1990; Verhoef, Eur. J. Drug
Metab. Pharmacokinetics 15, 83-93, 1990). The appropriate
delivery system for a given polypeptide will depend upon
its particular nature, the particular clinical application,
and the site of drug action. As with any protein drug,
oral delivery will likely present special problems, due
primarily to instability in the gastrointestinal tract and
poor absorption and bioavailability of intact, bioactive
drug therefrom. Therefore, especially in the case of oral
delivery, but also possibly in conjunction with other
routes of delivery, it will be necessary to use an
absorption-enhancing agent in combination with a given
polypeptide. A wide variety of absorption-enhancing agents
have been investigated and/or applied in combination with
protein drugs for oral delivery and for delivery by other
routes (Verhoef, 1990, supra; van Hoogdalem, Pharmac. Ther.
44, 407-43, 1989; Davis, J. Pharm. Pharmacol. 44(Suppl. 1),
186-90, 1992). Most commonly, typical enhancers fall into
the general categories of (a) chelators, such as EDTA,
salicylates, and N-acyl derivatives of collagen, (b)
surfactants, such as lauryl sulfate and
polyoxyethylene-9-lauryl ether, (c) bile salts, such as
glycocholate and taurocholate, and derivatives, such as
taurodihydrofusidate, (d) fatty acids, such as oleic acid
and capric acid, and their derivatives, such as
acylcarnitines, monoglycerides, and diglycerides, (e)
non-surfactants, such as unsaturated cyclic ureas, (f)
saponins, (g) cyclodextrins, and (h) phospholipids.

Other approaches to enhancing oral delivery of protein
drugs can include the aforementioned chemical modifications
to enhance stability to gastrointestinal enzymes and/or
increase lipophilicity. Alternatively, the protein drug
can be administered in combination with other drugs or
substances that directly inhibit proteases and/or other
potential sources of enzymatic degradation of proteins.

Yet another alternative approach to prevent or delay
gastrointestinal absorption of protein drugs is to
incorporate them into a delivery system that is designed to
protect the protein from contact with the proteolytic
enzymes in the intestinal lumen and to release the intact
protein only upon reaching an area favorable for its
absorption. A more specific example of this strategy is
the use of biodegradable microcapsules or microspheres,
both to protect vulnerable drugs from degradation, as well
as to effect a prolonged release of active drug (Deasy, in
Microencapsulation and Related Processes, Swarbrick, ed.,
Marcel Dekker, Inc.: New York, 1984, pp. 1-60, 88-89,
208-11). Microcapsules also can provide a useful way to effect
a prolonged delivery of a protein drug after injection
(Maulding, J. Controlled Release 6, 167-76, 1987).

The dose administered to an animal, such as a mammal,
particularly a human, in the context of the present
invention should be sufficient to effect a therapeutic or
prophylactic response in the individual over a reasonable
time frame. The dose will be determined by the particular
polypeptide, nucleic acid, antibody, or anti-antibody
administered, the severity of any existing disease state,
as well as the body weight and age of the individual. The
size of the dose also will be determined by the existence
of any adverse side effects that may accompany the use of
the particular polypeptide, nucleic acid, antibody or
anti-antibody employed. It is always desirable, whenever
possible, to keep adverse side effects to a minimum.

The dosage can be in unit dosage form, such as a
tablet or capsule. The term "unit dosage form" as used
herein refers to physically discrete units suitable as
unitary dosages for human and animal subjects, each unit
containing a predetermined quantity of a vector, alone or
in combination with other active agents, calculated in an
amount sufficient to produce the desired effect in
association with a pharmaceutically acceptable diluent,
carrier, or vehicle. The specifications for the unit
dosage forms of the present invention depend on the
particular embodiment employed and the effect to be
achieved, as well as the pharmacodynamics associated with
each polypeptide, nucleic acid or anti-antibody in the
host. The dose administered should be an "HIV infection
inhibiting amount" of an above-described polypeptide or
nucleic acid or an "immune response-inducing effective
amount" of an above-described polypeptide, an
above-described nucleic acid, or an antibody as appropriate.

Another composition provided by the present invention
is a composition comprising a solid support matrix to which
is attached an above-described polypeptide, or an
anti-antibody to an above-described polypeptide. The solid
matrix can comprise other functional reagents including,
for example, polyethylene glycol, dextran, albumin and the
like, whose intended effector functions may include one or
more of the following: to improve stability of the
conjugate; to increase the half-life of the conjugate; to
increase resistance of the conjugate to proteolysis; to
decrease the immunogenicity of the conjugate; to provide a
means to attach or immobilize a functional polypeptide or
anti-antibody onto a solid support matrix (see, for
example, Harris, in Poly(Ethylene Glycol) Chemistry:
Biotechnical and Biomedical Applications, Harris, ed.,
Plenum Press: New York (1992), pp. 1-14). Conjugates
furthermore may comprise a polypeptide or anti-antibody
coupled to an effector molecule, each of which, optionally,
may have different functions (e.g., a toxin
molecule (or an immunological reagent) and a polyethylene
glycol (or dextran or albumin) molecule). Diverse
applications and uses of functional proteins and
polypeptides, attached to or immobilized on a solid support
matrix, are exemplified more specifically for poly(ethylene
glycol) conjugated proteins or peptides in a review by
Holmberg et al. (In Poly(Ethylene Glycol) Chemistry:
Biotechnical and Biomedical Applications, Harris, ed.,
Plenum Press: New York, 1992, pp. 303-324).

In addition, the present invention provides a method
of removing HIV from a bodily fluid of an animal. The
method comprises extracorporeally contacting the bodily
fluid of the animal with a solid-support matrix to which is
attached an above-described polypeptide or an anti-antibody
to an above-described polypeptide. Alternatively, the
bodily fluid can be contacted with the polypeptide or
anti-antibody in solution and then the solution can be contacted
with a solid support matrix to which is attached a means to
remove the polypeptide or anti-antibody to which is bound
HIV gp120 from the bodily fluid.

Methods of attaching a herein-described polypeptide,
or an anti-antibody, to a solid support matrix are known in
the art. "Attached" is used herein to refer to attachment
to (or coupling to) and immobilization in or on a solid
support matrix. See, for example, Harris, in Poly(Ethylene
Glycol) Chemistry: Biotechnical and Biomedical
Applications, Harris, ed., Plenum Press: New York (1992),
pp. 1-14, and international patent application WO 91/02714
(Saxinger). Diverse applications and uses of functional
polypeptides attached to or immobilized on a solid support
matrix are exemplified more specifically for poly(ethylene
glycol) conjugated proteins or peptides in a review by
Holmberg et al. (In Poly(Ethylene Glycol) Chemistry:
Biotechnical and Biomedical Applications, Harris, ed.,
Plenum Press: New York, 1992, pp. 303-324).

The present invention also provides a method of making
an antibody that binds to gp120 of HIV under physiological
conditions. The method comprises labeling an embodiment of
the present inventive compound to obtain a labeled
compound. Labeling compounds is within the skill of the
ordinary artisan. For example, the present inventive
compound can be labeled with a radioactive atom, such as 125I,
in the same or a similar manner as was performed in the
examples provided below. Alternatively, an enzyme, such as
horseradish peroxidase, can be attached to or incorporated
into the present inventive compound. Then, by exposing a
chromogenic or photogenic compound to the compound, a
signal indicative of the presence and quantity of the
compound present can be generated. In another alternative,
a polyhistidinyl moiety can be attached to, or incorporated
into, the present inventive compound so that the present
inventive compound will react with high affinity to
transition metal ions such as nickel, copper, or zinc ions;
this reaction can be used as the basis to quantify the
amount of the present inventive compound present at a
particular location. In yet another alternative, the
present inventive compound can be used as antigen to a
standard antibody that specifically recognizes an antigenic
epitope of the present inventive compound. As is
well-known, the standard antibody can itself be labeled or used
in conjunction with an additional antibody that is labeled
with an enzyme, radioisotope, or other suitable means. The
skilled artisan will recognize that there is a plethora of
other suitable means and methods to label the present
inventive compound.

The present inventive method of making an antibody
that binds to a gp120 envelope protein of HIV further
comprises providing a library of synthetic peptides. The
library consists of a multiplicity of
synthetically-produced polypeptides that are homologous, and preferably
essentially identical (i.e., having the same primary amino
acid residue sequence, ignoring blocking groups,
phosphorylation of serinyl, threoninyl, and tyrosinyl
residues, hydroxylation of prolinyl residues, and the like)
or identical, to a continuous region of an HIV gp120
envelope protein. The polypeptides of the library can be
any suitable length. While larger regions allow faster
scanning and tend to preserve non-linear epitopes, shorter
length polypeptides allow more sensitive screening of the
primary sequence of the gp120 protein. However,
polypeptides that are too short can lose essential
secondary structure or cleave reactive sites into one or
more pieces. Preferably, a mixture of short and long
polypeptides is incorporated into the library; however,
the library can consist of polypeptides of a single length
(measured in amino acid residues). For the sake of
convenience, the library can be split into multiple parts
and screened by parts. Typically, the polypeptides of the
library will be between about 6 and about 45 amino acid
residues in length.

Typically, the library will comprise a series of
polypeptides each having an identical sequence to that of
gp120 but having an amino-terminus a particular number of
amino acids downstream of the amino-terminus of the prior
polypeptide (see the Examples section below). The distance,
measured in amino acid residues, is referred to as the
offset. Preferably, in libraries that are characterized by
the existence of an offset, the offset is not greater than
the product of the length of the longest polypeptide, measured
in amino acid residues, and 1.5, preferably 1.0, and more
preferably 0.5. The library can be alternatively
characterized by the existence of an offset not greater
than 30, preferably 15, and more preferably 4.
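
The notion of an offset library can be illustrated with a minimal sketch. It assumes a fixed peptide length of 18 residues and an offset of 4, which are the values visible in the segment numbering of the Examples (1-18, 5-22, 9-26, and so on). The function names `tile_peptides` and `offset_within_limit` are hypothetical and are not part of the disclosed method.

```python
def tile_peptides(sequence, length=18, offset=4):
    """Split a protein sequence into overlapping peptides.

    Returns (start, end, peptide) tuples with 1-based, inclusive
    positions. The final peptide may be shorter than `length` when
    the sequence does not divide evenly.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    peptides = []
    start = 0
    while start < len(sequence):
        peptide = sequence[start:start + length]
        peptides.append((start + 1, start + len(peptide), peptide))
        if start + length >= len(sequence):
            break  # this window already reaches the carboxy-terminus
        start += offset
    return peptides


def offset_within_limit(offset, longest_length, factor=0.5):
    """Check the preferred bound: offset <= factor x longest peptide length."""
    return offset <= factor * longest_length


# First 26 residues of the sequence tiled in Example 1.
for start, end, peptide in tile_peptides("MDYQVSSPIYDINYYTSEPCQKINVK"):
    print(f"{start}-{end}\t{peptide}")
```

With an 18-residue peptide, an offset of 4 satisfies even the most preferred bound (4 is not greater than 0.5 x 18 = 9), so every residue is covered by several overlapping peptides.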

Each polypeptide of the library is substantially
isolated from every other polypeptide of said library and
is located in a known position. For example, each
polypeptide can be bound to a solid support that is in
a vessel or that can be placed in a vessel. The vessel
preferably enables each polypeptide to be covered in a
liquid that does not contact any other polypeptide of
the library. By way of example, each polypeptide can be
bound to a bead that is placed in a vessel (or tube) or can
be bound to the well of a multi-well assay plate.
Alternatively, an array of polypeptides can be fashioned,
for example on a microchip device (as is presently used in
some DNA sequencing devices and methods), and the entire
array can be bathed in a single solution.

Each polypeptide is then individually contacted with
the labeled compound such that a portion of the labeled
compound can bind with the polypeptide of the library. In
this way, a bound population of the labeled compound of
the present invention and an unbound population of the
labeled compound are generated. The phrase "individually
contacted" means that each polypeptide has the opportunity
to bind with the labeled compound and the quantity of
labeled compound bound by each can be determined.

The method then comprises removing substantially all
of the unbound labeled compound from the position occupied
by each polypeptide. That is, the solution comprising the
labeled compound is separated from the polypeptides of the
library and the bound population of the labeled compound.
This can be done by any suitable method, e.g., by
aspiration and one or more washing steps comprising adding
a quantity of liquid sufficient to cover all the surfaces
that were contacted by the labeled compound and aspirating
away substantially all of the wash liquid.

The amount of labeled compound that remains
co-localized with each polypeptide of the library is then
measured to determine the quantity of labeled compound
bound by each polypeptide. The amount of the present
inventive compound bound by each polypeptide can be
directly evaluated to identify a portion of the HIV gp120
envelope protein that binds to an HIV receptor selected
from the group consisting of CCR5, CXCR4, STRL33, and CD4.
This information is then used to identify and provide an
immunizing compound. The immunizing compound comprises a
polypeptide comprising an amino acid sequence that is
homologous to, or preferably is essentially identical to,
or identical to, the portion of the HIV-1 gp120 envelope
protein that binds with CD4, CCR5, CXCR4, and/or STRL33.
The immunizing compound can be provided by processing gp120,
e.g., proteolytically digesting gp120 that has been
isolated from a preparation of HIV-1. Preferably, however,
the immunizing compound is prepared synthetically, or by
genetic engineering, or by a combination of genetic
engineering and synthetic methods. The immunizing compound
can comprise a pharmaceutically acceptable substituent, can
be encoded by a nucleic acid that can be expressed in a
cell, can be mixed with a carrier, and is an inventive
aspect of the present invention.

An immunogenic quantity of the immunizing compound is
then inserted into an animal (e.g., a human, a rodent, a
canine, a feline, or a ruminant) in a manner consistent
with the discussion of a method of raising an antibody to
the present inventive compounds that are homologous to
portions of CCR5, CXCR4, STRL33, and CD4, above. The
insertion of the immunizing compound causes the inoculated
animal to produce an antibody that binds with said portion
of the HIV gp120 envelope protein. Thus the present
invention also provides an antibody that binds to an HIV
gp120 envelope protein, as well as an antigen binding
protein comprising one or more complementarity determining
regions of the antibody (e.g., a Fab, a Fab2', an Fv, a
single-chain antibody, a diabody, and humanized variants of
all of the above, all of which are within the skill in the
art).

The antibody or variant thereof is preferably useful
in detecting or diagnosing the presence of HIV gp120
envelope protein, and thus HIV, in an animal. The antibody
also preferably prevents or attenuates infection of an
animal exposed to HIV, to whom an effective quantity of the
antibody or a variant thereof has been administered or
produced in response to inoculation with the immunizing
compound. The antibody preferably also is useful in
treating or preventing (i.e., inhibiting) HIV infection in
an animal to whom a suitable dose has been administered or
in which a suitable quantity of antibody has been produced.
The antibody is also useful in the study of HIV infection
of mammalian cells, the host range specificities of HIV
infection, and preferably, the mechanism by which
antibodies neutralize infectious viruses.

EXAMPLES

The following examples further illustrate the present
invention but, of course, should not be construed as
limiting the scope of the claimed invention in any way.

Synthetic peptide arrays were constructed in 96-well
microtiter plates in accordance with the method set forth
in WO 91/02714 (Saxinger), and used to test the binding of
HIV-1LAI envelope gp120 that had been labeled with
radioactive iodine (radiolabeling by standard methods).
After incubating the radiolabeled gp120 in a well with each
synthetic peptide, a washing step was performed to remove
unbound label, and the relative level of radioactivity
remaining in each well of the plate was evaluated to
determine the relative affinity of each peptide for the
gp120. The synthesis of the peptides and the quantity of
binding between the synthetic peptides and the gp120 were
found to be suitably reproducible, precise, and sensitive.
Initial screening of the entire primary sequence of the
chemokine and CD4 receptor molecules was performed 18 amino
acid residues at a time.
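
The tables in the Examples report counts accumulated over twenty minutes, with background or non-specific counts subtracted. The arithmetic can be sketched as follows, using the figures reported for CXCR4 in Example 2 (3003 counts for segment 1-18 and 412 counts for the empty control). Treating the empty-control well as the background is an assumption made for illustration, and the function names are hypothetical.

```python
def net_counts(raw_counts, background_counts):
    """Subtract background (non-specific) counts from the raw counts."""
    return raw_counts - background_counts


def counts_per_minute(counts, minutes=20):
    """Convert counts accumulated over `minutes` into cpm."""
    return counts / minutes


raw = 3003         # segment 1-18 of CXCR4, counts per 20 minutes (Example 2)
background = 412   # empty (control) well, assumed here to be the background

net = net_counts(raw, background)
print(net)                     # 2591
print(counts_per_minute(net))  # 129.55
```

The net value of 2591 agrees with the background-subtracted figure listed for the same segment in Example 2.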

The authenticity of the binding signals generated by
this technique has been repeatedly demonstrated by showing
that antibodies to CCR5 and CXCR4 are able to inhibit the
binding of radiolabeled gp120 to the polypeptides derived
from CCR5 and CXCR4 that show a high affinity for binding
with gp120. Additionally, the accuracy of the binding
assay used hereinbelow is demonstrated by Example 7.



Example 1

This example identifies segments of the CCR5
co-receptor that bind with gp120.

The first column in the table below indicates the
positions of the amino acids of each segment in the
wild-type CCR5 receptor. The second column explicitly
identifies the peptide sequence. The third column indicates
the radioactive counts recorded in twenty minutes (i.e., the
cpm x 20) after the background or non-specific counts had been
subtracted. The fourth column contains an X in each row
for which the listed polypeptide bound with high affinity
to gp120. The fifth column contains an X in each
row wherein the listed sequence binds with substantial
affinity but is weak in comparison to other samples,
particularly adjacent samples. The final column lists the
SEQ ID NO of each peptide.

SEQ SEG         PEPTIDE             Counts per 20'  Peak      Non-Peak  SEQ ID
                                    (Average -      Activity  Activity  NO:
                                    background)

empty (control)                     7
1-18            MDYQVSSPIYDINYYTSE  735             X
5-22            VSSPIYDINYYTSEPCQK  383                       X         32
9-26            IYDINYYTSEPCQKINVK  228                       X         33
13-30           NYYTSEPCQKINVKQIAA  6                                   34
17-34           SEPCQKINVKQIAARLLP  -44                                 35
21-38           QKINVKQIAARLLPPLYS  20                                  36
25-42           VKQIAARLLPPLYSLVFI  18                                  37
29-46           AARLLPPLYSLVFIFGFV  33                                  38
33-50           LPPLYSLVFIFGFVGNML  705             X                   39
37-54           YSLVFIFGFVGNMLVILI  347                       X         40
41-58           FIFGFVGNMLVILILINC  343                       X         41
45-62           FVGNMLVILILINCKRLK  62                                  42
49-66           MLVILILINCKRLKSMTD  84                                  43
53-70           LILINCKRLKSMTDIYLL  2                                   44
57-74           NCKRLKSMTDIYLLNLAI  25                                  45
61-78           LKSMTDIYLLNLAISDLF  210                                 46
65-82           TDIYLLNLAISDLFFLLT  38                                  47
69-86           LLNLAISDLFFLLTVPFW  144                                 48
73-90           AISDLFFLLTVPFWAHYA  41                                  49
77-94           LFFLLTVPFWAHYAAAQW  173                                 50
81-98           LTVPFWAHYAAAQWDFGN  306                                 51
85-             FWAHYAAAQWDFGNTMCQ  212                                 52
89-             YAAAQWDFGNTMCQLLTG  494                       X         53
93-             QWDFGNTMCQLLTGLYFI  1019            X                   54
97-             GNTMCQLLTGLYFIGFFS  941             X                   55
101-            CQLLTGLYFIGFFSGIFF  489                       X         56
105-            TGLYFIGFFSGIFFIILL  80                                  57
109-            FIGFFSGIFFIILLTIDR  76                                  58
113-            FSGIFFIILLTIDRYLAV  83                                  59
117-            FFIILLTIDRYLAVVHAV  77                                  60
121-            LLTIDRYLAVVHAVFALK  31                                  61
125-            DRYLAVVHAVFALKARTV  62                                  62
129-            AVVHAVFALKARTVTFGV  34                                  63
133-            AVFALKARTVTFGVVTSV  63                                  64
137-            LKARTVTFGVVTSVITWV  74                                  65
141-            TVTFGVVTSVITWVVAVF  -25                                 66
145-            GVVTSVITWVVAVFASLP  69                                  67
149-            SVITWVVAVFASLPGIIF  46                                  68
153-            WVVAVFASLPGIIFTRSQ  87                                  69
157-            VFASLPGIIFTRSQKEGL  54                                  70
161-            LPGIIFTRSQKEGLHYTC  118                                 71
165-            IFTRSQKEGLHYTCSSHF  98                                  72
169-            SQKEGLHYTCSSHFPYSQ  304                       X         73
173-            GLHYTCSSHFPYSQYQFW  301                       X         74
177-            TCSSHFPYSQYQFWKNFQ  367                       X         75
181-            HFPYSQYQFWKNFQTLKI  1008                      X         76
185-            SQYQFWKNFQTLKIVILG  1572            X                   77
189-            FWKNFQTLKIVILGLVLP  40                                  78
193-            FQTLKIVILGLVLPLLVM  45                                  79
197-            KIVILGLVLPLLVMVICY  65                                  80
201-            LGLVLPLLVMVICYSGIL  180                                 81
205-            LPLLVMVICYSGILKTLL  68                                  82
209-            VMVICYSGILKTLLRCRN  -8                                  83
213-            CYSGILKTLLRCRNEKKR  70                                  84
217-            ILKTLLRCRNEKKRHRAV  19                                  85
221-            LLRCRNEKKRHRAVRLIF  102                                 86
225-            RNEKKRHRAVRLIFTIMI  23                                  87
229-            KRHRAVRLIFTIMIVYFL  36                                  88
233-            AVRLIFTIMIVYFLFWAP  62                                  89
237-            IFTIMIVYFLFWAPYNIV  121                                 90
241-            MIVYFLFWAPYNIVLLLN  214                                 91
245-            FLFWAPYNIVLLLNTFQE  616                       X         92
249-            APYNIVLLLNTFQEFFGL  1962            X                   93
253-            IVLLLNTFQEFFGLNNCS  2134            X                   94
257-            LNTFQEFFGLNNCSSSNR  293                       X         95
261-            QEFFGLNNCSSSNRLDQA  63                                  96
265-            GLNNCSSSNRLDQAMQVT  -31                                 97
269-            CSSSNRLDQAMQVTETLG  90                                  98
273-            NRLDQAMQVTETLGMTHC  10                                  99
277-            QAMQVTETLGMTHCCINP  81                                  100
281-            VTETLGMTHCCINPIIYA  15                                  101
285-            LGMTHCCINPIIYAFVGE  282                       X         102
289-            HCCINPIIYAFVGEKFRN  200                       X         103
293-            NPIIYAFVGEKFRNYLLV  162                       X         104
297-            YAFVGEKFRNYLLVFFQK  596             X                   105
301-            GEKFRNYLLVFFQKHIAK  69                                  106
305-            RNYLLVFFQKHIAKRFCK  65                                  107
309-            LVFFQKHIAKRFCKCCSI  76                                  108
313-            QKHIAKRFCKCCSIFQQE  23                                  109
317-            AKRFCKCCSIFQQEAPER  64                                  110
321-            CKCCSIFQQEAPERASSV  53                                  111
325-            SIFQQEAPERASSVYTRS  100                                 112
329-            QEAPERASSVYTRSTGEQ  84                                  113
333-            ERASSVYTRSTGEQEISV  84                                  114
337-            SVYTRSTGEQEISVGL    47                                  115



These data indicate that, in addition to polypeptide
sequences derived from positions 1-18 of the CCR5 receptor,
the polypeptide sequences LPPLYSLVFIFGFVGNML (SEQ ID NO: 11),
QWDFGNTMCQLLTGLYFIGFFS (SEQ ID NO: 12),
SQYQFWKNFQTLKIVILG (SEQ ID NO: 13), APYNIVLLLNTFQEFFGLNNCS
(SEQ ID NO: 14), and YAFVGEKFRNYLLVFFQK (SEQ ID NO: 15)
comprise multiple subsequences, each of which is capable of
binding to HIV-1 envelope gp120.
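The CCR5 peptides tabulated above step along the receptor in 18-residue windows whose start positions advance four residues at a time (for example 329, 333, 337), with the final window shortened where the protein ends. A minimal Python sketch of that enumeration follows. The function name `scan_windows` and its parameters are hypothetical, and the 24-residue segment is simply the union of the last three CCR5 peptides listed in the table.

```python
def scan_windows(sequence, first_position=1, width=18, step=4):
    """Return (start, end, peptide) tuples for overlapping scanning windows.

    Positions are 1-based and inclusive. The last window is truncated
    when it reaches the end of the sequence, and scanning stops there.
    """
    windows = []
    offset = 0
    while offset < len(sequence):
        peptide = sequence[offset:offset + width]
        start = first_position + offset
        end = start + len(peptide) - 1
        windows.append((start, end, peptide))
        if offset + width >= len(sequence):
            break  # this window already reaches the C-terminal end
        offset += step
    return windows


# CCR5 residues 329-352, assembled from the last three peptides in the table.
CCR5_TAIL = "QEAPERASSVYTRSTGEQEISVGL"

if __name__ == "__main__":
    for start, end, peptide in scan_windows(CCR5_TAIL, first_position=329):
        print(f"{start}-{end}  {peptide}")
```

Keeping the window width and step as parameters lets the same helper describe other scanning layouts without changing the logic.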

Example 2 

This example identifies segments of the CXCR4
co-receptor that bind with gp120.

The first column in the table below indicates the
number of the amino acid in the wild-type CXCR4 receptor.
The second column explicitly identifies the peptide
sequence. The third and fourth columns indicate the
radioactive counts recorded in twenty minutes (i.e., the
cpm x 20) after the background or non-specific counts had
been subtracted. The fifth column contains an X in each
row for which the listed polypeptide bound with high
affinity to gp120. The sixth and final column contains an
X in each row wherein the listed sequence binds with
substantial affinity but is weak in comparison to other
samples, particularly adjacent samples.

Major Activity Peak

Minor Activity Peak

SEQ ID NO:



empty (control) 


412 


0 








1-18 


MEGIS I YTSDNYTEEMGS 


3003 


2591 


X 




116 


5--22 


SIYTSDNYTEEMGSGDYD 


483 


71 






117 


9--26 


SDNYTEEMGSGDYDSMKE 


455 


43 






118 


13-30 


TEEMGSGDYDSMKEPCFR 


453 


41 






119 


17-34 


GSGDYDSMKE PCFREENA 


384 


-28 






120 


21-38 


YDSMKEPCFREENANFNK 


465 


53 






121 


25-42 


KE PCFREENANFNKI FLP 


664 


252 






122 


29-46 


FREENANFNKI FLPTI YS 


463 


51 






123 


33-50 


NANFNKI FL PT I YS I I FL 


585 


173 






124 


37-54 


NKI FLPTI YS I I FLTGI V 


550 


138 






125 


41-58 


LPTIYSIIFLTGIVGNGL 


530 


118 






126 


45-62 


YS I I FLTGI VGNGLVI LV 


535 


123 






127 


49-66 


F LTG I VGNGL V I L VMG YQ 


658 


246 






128 


53-70 


I VGNGLVI LVMGYQKKLR 


650 


238 






129 


57-74 


GLVILVMGYQKKLRSMTD 


569 


157 






130 


61-78 


LVMGYQKKLR SMTDKYRL 


517 


105 






131 


65-82 


YQKKLRSMTDKYRLHLSV 


511 


99 






132 


69-86 


LRSMTDKYRLHLSVADLL 


572 


160 






133 


73-90 


TDKYRLHLS VADLLFVI T 


504 


92 






134 


77-94 


RLHLSVADLLFVITLPFW 


548 


136 






135 


81-98 


SVADLLFVITLPFWAVDA 


665 


253 






136 


85-102 


LLFVITLPFWAVDAVANW 


475 


63 






137 


89-106 


I T L P F WAVDAVANWY FGN 


542 


130 






138 


93-110 


FWAVDAVANWYFGNFLCK 


478 


66 






139 


97-114 


DAVANWYFGNFLCKAVHV 


524 


112 






140 


101-118 


NWYFGNFLCKAVHVIYTV 


508 


96 






141 


105-122 


GNFLCKAVHVIYTVNLYS 


643 


231 






142 


109-126 


C KAVHV I YT VNL Y S S VL I 


655 


243 






143 


113-130 


HVI YTVNL YS S VL I LAF I 


530 


118 






144 


117-134 


TVNL YS S VL I LAF I S LDR 


654 


242 






145 


121-138 


YS S VL I LAF I S LDRYLAI 


569 


157 






146 


125-142 


L I LAF I SLDRYLAI VHAT 


519 


107 






147 


129-146 


F I SLDRYLAI VHATNSQR 


503 


91 






148 


133-150 


DRYLAIVHATNSQRPRKL 


580 


168 






149 


137-154 


AIVHATNSQRPRKLLAEK 


485 


73 






150 


141-158 


ATNSQRPRKLLAEKWYV 


490 


78 






151 


145-162 


QRPRKLLAEKWYVGVWI 


539 


127 






152 


149-166 


KLLAE KWYVGVW I PALL 


501 


89 






153 


153-170 


E KWYVGVW I PALLLT I P 


559 


147 






154 


157-174 


YVGVW I PALLLT I PD F I F 


536 


124 






155 


161-178 


W I PALLLT I PDF I FANVS 


594 


182 






156 


165-182 


LLLT I PDF I FANVS E ADD 


1418 


1006 


X 




157 


169-186 


I PD F I FANVS E ADDR Y I C 


850 


438 




X 


158 


173-190 


IFANVSEADDRYICDRFY 


679 


267 






159 


177-194 


VSEADDRYICDRFYPNDL 


569 


157 






160 


181-198 


DDRYI CDRFYPNDLWVW 


537 


125 






161 
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185-202 


ICDRFYPNDLWWFQFQ 


718 


306 






162 


189-206 


F YPNDLWVWFQFQH I MV 


828 


416 




X 


163 


193-210 


DLWVWFQFQH I MVGL I L 


834 


422 


X 




164 


197-214 


WFQFQHIMVGLILPGIV 


1001 


589 




X 


165 


201-218 


FQHIMVGLILPGIVILSC 


582 


170 






166 


205-222 


MVGLILPGIVILSCYCII 


579 


167 






167 


209-226 


ILPGIVILSCYCIIISKL 


604 


192 






168 


213-230 


IVILSCYCIIISKLSHSK 


689 


277 






169 


217-234 


SCYCI I I SKLSHSKGHQK 


671 


259 






170 


221-238 


I I I SKLSHSKGHQKRKAL 


569 


157 






171 


225-242 


KLSHS KGHQKRKALKTTV 


542 


130 






172 


229-246 


S KGHQKRKALKTTV I L I L 


552 


140 






173 


233-250 


QKRKALKTTVILILAFFA 


695 


283 






174 


237-254 


AL KTT V I L I LAF F ACWL P 


673 


261 






175 


241-258 


TV I L I LAF FACWL P YY I G 


735 


323 






176 


245-262 


ILAFFACWLPYYIGISID 


596 


184 






177 


249-266 


FACWLPYYIGISIDSFIL 


614 


202 






178 


253-270 


LPYYIGISIDSFILLEII 


851 


439 






179 


257-274 


IGISIDSFILLEIIKQGC 


1146 


734 




X 


180 


261-278 


IDSFILLEI IKQGCEFEN 


3884 


3472 


X 




181 


265-282 


I LLE I I KQGCE FENTVHK 


529 


117 






182 


269-286 


I I KQGCEFENTVHKW I S I 


518 


106 






183 


273-290 


GCEFENTVHKWISITEAL 


676 


264 






184 


277-294 


ENTVHKW I S I TE ALAFFH 


727 


315 






185 


281-298 


HKWISITEALAFFHCCLN 


575 


163 






186 


285-302 


SITEALAFFHCCLNPILY 


600 


188 






187 


289-306 


ALAFFHCCLNPILYAFLG 


593 


181 






188 


293-310 


FHCCLNP I L YAFLGAKFK 


535 


123 






189 


297-314 


LNP I L YAFLGAKFKTSAQ 


686 


274 






190 


301-318 


LYAFLGAKFKTSAQHALT 


568 


156 






191 


305-322 


LGAKFKTSAQHALTSVSR 


612 


200 






192 


309-326 


FKTSAQHALTSVSRGSSL 


585 


173 






193 


313-330 


AQHALTSVSRGS SLKI LS 


559 


147 






194 


317-334 


LTSVSRGSSLKILSKGKR 


595 


183 






195 


321-338 


SRGSSLKILSKGKRGGHS 


581 


169 






196 


325-342 


SLKI LS KGKRGGHS S VST 


697 


285 






197 


329-346 


LSKGKRGGHSSVSTESES 


597 


185 






198 


333-350 


KRGGHS S VSTES E S S S FH 


579 


167 






199 


337-352 


HSSVSTESESSSFHSS 


515 


103 






200 
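In the CXCR4 table above, the second count column appears to equal the first minus the 412 counts recorded for the empty control well (for example, 3003 - 412 = 2591). The sketch below reproduces that subtraction for a few rows taken from the table. The helper names and the cutoff used to list candidate peaks are assumptions for illustration; the text does not state a numeric threshold for the X marks.

```python
CONTROL_COUNTS = 412  # counts in the empty (control) well, from the table

# (segment, raw counts) for a few rows copied from the CXCR4 table
RAW_COUNTS = {
    "1-18": 3003,
    "5-22": 483,
    "165-182": 1418,
    "169-186": 850,
    "261-278": 3884,
}


def net_counts(raw, control=CONTROL_COUNTS):
    """Subtract the control-well counts from each raw value."""
    return {segment: counts - control for segment, counts in raw.items()}


def candidate_peaks(net, cutoff):
    """List segments whose net counts exceed a caller-chosen cutoff.

    The cutoff is hypothetical; the source marks peaks by inspection.
    """
    return [segment for segment, counts in net.items() if counts > cutoff]


if __name__ == "__main__":
    net = net_counts(RAW_COUNTS)
    for segment, counts in net.items():
        print(f"{segment:>8}  {counts}")
    print(candidate_peaks(net, cutoff=1000))
```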



These data indicate that, in addition to polypeptide
sequences derived from positions 1-18 of the CXCR4
receptor, the polypeptide sequences LLLTIPDFIFANVSEADD (SEQ
ID NO: 16) (165-182), WFQFQHIMVGLILPGIV (SEQ ID NO: 17)
(197-214), and IDSFILLEIIKQGCEFEN (SEQ ID NO: 18) (261-278)
comprise multiple subsequences, each of which is capable of
binding to HIV-1 envelope gp120.

Example 3

This example identifies segments of the STRL33
co-receptor that bind with gp120.

The first column in the table below indicates the
number of the amino acid in the wild-type STRL33 receptor.
The second column explicitly identifies the peptide
sequence. The third and fourth columns indicate the
radioactive counts recorded in twenty minutes (i.e., the
cpm x 20) after the background or non-specific counts had
been subtracted. The fifth column contains an X in each
row for which the listed polypeptide bound with high
affinity to gp120. The sixth and final column contains an
X in each row wherein the listed sequence binds with
substantial affinity but is weak in comparison to other
samples, particularly adjacent samples.

SEQ SEG    PEPTIDE    Major Activity Peak    Minor Activity Peak    SEQ ID NO:





empty (control) 


-34 


.5 


34 


.5 






1--18 


MAEHDYHEDYGFSSFNDS 


1178 


.5 


1320 


.5 


X 


201 


5--22 


DYHEDYGFS S FNDS SQEE 


3357 


.5 


3689 


.5 


X 


202 


9--2S 


DYGFSSFNDSSQEEHQAF 


8579 


.5 


8909 


.5 


X 


203 


13-30 


SSFNDSSQEEHQAFLQFS 


2689 


.5 


2757 


.5 


X 


204 


17-34 


DSSQEEHQAFLQFSKVFL 


869 


.5 


2152 


.5 


X 


205 


21-38 


EEHQAFLQFSKVFLPCMY 


2316 


.5 


1819 


.5 


X 


206 


25-42 


AFLQFSKVFLPCMYLWF 


1421 


.5 


1359 


.5 


X 


207 


29-46 


FSKVFLPCMYLVVFVCGL 


534 


.5 


633 


.5 




208 


33-50 


FLPCMYLWFVCGLVGNS 


605 


.5 


372 


. 5 




209 


37-54 


MYLWFVCGLVGNS LVLV 


168 


.5 


235 


. 5 




210 


41-58 


VFVCGLVGNSLVLVI S I F 


570 


.5 


284 


. 5 




211 


45-62 


GLVGNSLVLVI S I FYHKL 


164 


.5 


95 


.5 




212 


49-66 


NSLVLVI S I FYHKLQSLT 


1255 


.5 


1378 


. 5 


X 


213 


53-70 


LVI S I F YHKLQSLTDVFL 


1620 


.5 


1780 


. 5 


X 


214 


57-74 


I FYHKLQSLTDVFLVNLP 


1275 


.5 


1256 


. 5 


X 


215 


61-78 


KLQSLTDVFLVNLPLADL 


412 


.5 


348 


. 5 




216 


65-82 


LTDVFLVNL PLADLVFVC 


233 


.5 


336 


. 5 




217 


69-86 


FLVNLPLADLVFVCTLPF 


70 


.5 


51 


. 5 




218 


73-90 


LPLADLVFVCTLPFWAYA 


557 


.5 


960 


.5 


X 


219 


77-94 


D L VFVCT L P F WA YAG I HE 


1116 


.5 


1063 


.5 


X 


220 


81-98 


VCTL P FWAYAG I HE WVFG 


1819 


.5 


1754 


.5 


X 


221 


85-102 


P FWAYAG I HE WVFGQVMC 


7262 


.5 


7537 


. 5 


X 


222 


89-106 


YAGIHEWVFGQVMCKSLL 


5911 


.5 


6245 


. 5 


X 


223 


93-110 


HEWVFGQVMCKSLLGIYT 


3391 


.5 


3466 


.5 


X 


224 


97-114 


FGQVMCKSLLGIYTINFY 


1257 


.5 


1354 


.5 


X 


225 


101-118 


MCKSLLGIYTINFYTSML 


1505 


.5 


1283 


. 5 




226 


105-122 


LLGIYTINFYTSMLILTC 


499 


.5 


408 


. 5 




227 


109-126 


YT I NF YT SML I LTC I TVD 


351 


.5 


510 


. 5 




228 


113-130 


FYTSMLILTCITVDRFIV 


744 


.5 


907 


.5 




229 


117-134 


MLILTCITVDRFIVWKA 


298 


.5 


228 


. 5 




230 


121-138 


TC I TVDRF I WVKATKAY 


89 


.5 


346 


.5 




231 


125-142 


VDRFIWVKATKAYNQQA 


103 


.5 


53 


.5 




232 


129-146 


IWVKATKAYNQQAKRMT 


166 


.5 


43 


.5 




233 


133-150 


KAT KA YNQQ AKRMTWGKV 


701 


.5 


568 


. 5 




234 


137-154 


AYNQQAKRMTWGKVTSLL 


55 


.5 


4 


.5 




235 


141-158 


QAKRMTWGKVTS LL I WV I 


-71 


.5 


-31 


.5 




236 


145-162 


MTWGKVTSLLIWVISLLV 


-0 


.5 


-26 


.5 




237 


149-166 


KVTSLLIWVISLLVSLPQ 


-39 


.5 


-118 


.5 




238 


153-170 


LLIWVISLLVSLPQIIYG 


42 


.5 


75 


.5 




239 


157-174 


VISLLVSLPQI I YGNVFN 


-60 


.5 


-127 


.5 




240 


161-178 


LVSLPQI I YGNVFNLDKL 


91 


.5 


-15 


.5 




241 


165-182 


PQI I YGNVFNLD KL I CGY 


-18 


.5 


-37 


.5 




242 


169-186 


YGNVFNLDKLICGYHDEA 


-41 


.5 


-20 


.5 




243 


173-190 


FNLDKL I CGYHDEAI STV 


1072 


.5 


1078 


.5 


X 


244 


177-194 


KLICGYHDEAISTWLAT 


1363 


.5 


1604 


.5 


X 


245 






181 


-198 


GYHDEAI STWLATQMTL 


754 


.5 


1181 


.5 




X 


246 


185 


-202 


EAI STWLATQMTLGFFL 


3973 


.5 


3745 


.5 


X 




247 


189 


-206 


TWLATQMTLGFFLPLLT 


2327 


.5 


2389 


. 5 




X 


248 


193 


-210 


ATQMTLGFFLPLLTMIVC 


2365 


.5 


2444 


.5 




X 


249 


197 


-214 


TLGFFLPLLTMIVCYSVI 


2387 


.5 


479 


.5 






250 


201 


-218 


FLPLLTMIVCYSVI IKTL 


1270 


.5 


1195 


.5 




X 


251 


205 


-222 


LTMIVCYSVI IKTLLHAG 


2787 


.5 


2654 


.5 


X 




252 


209 


-226 


VCYSVI I KTLLHAGGFQK 


1334 


.5 


1143 


.5 




X 


253 


213 


-230 


VI I KTLLHAGGFQKHRS L 


961 


. 5 


682 


.5 






254 


217 


-234 


TLLHAGGFQKHRSLKI I F 


1041 


. 5 


999 


.5 






255 


221 


-238 


AGGFQKHRSLKI I FLVMA 


340 


.5 


260 


.5 






256 


225 


-242 


QKHRSLKI I FLVMAVFLL 


810 


.5 


814 


.5 






257 


229 


-246 


SLKI I FLVMAVFLLTQMP 


612 


.5 


853 


. 5 






258 


233 


-250 


I FLVMAVFLLTQMPFNLM 


386 


.5 


772 


.5 






259 


237 


-254 


MAVFLLTQMPFNLMKFIR 


2263 


.5 


2842 


.5 


X 




260 


241 


-258 


LLTQMPFNLMKFIRSTHW 


2513 


.5 


3154 


.5 


X 




261 


245 


-262 


MPFNLMKFIRSTHWEYYA 


2171 


.5 


2182 


.5 




X 


262 


249 


-266 


LMKF I RSTHWE YYAMTS F 


934 


. 5 


949 


.5 






263 


253 


-270 


I RSTHWE YYAMTS FH YT I 


1571 


.5 


1807 


.5 




X 


264 


257 


-274 


HWEYYAMTSFHYTIMVTE 


2040 


.5 


3065 


.5 


X 




265 


261 


-278 


YAMTS FH YT I MVTEAI AY 


2688 


.5 


2359 


.5 




X 


266 


265 


-282 


SFHYTIMVTEAIAYLRAC 


761 


.5 


1033 


.5 






267 


269 


-286 


TIMVTEAIAYLRACLNPV 


140 


.5 


272 


.5 






268 


273 


-290 


TEAIAYLRACLNPVLYAF 


604 


.5 


480 


.5 






269 


277 


-294 


AYLRACLNPVLYAFVSLK 


1802 


.5 


1849 


.5 




X 


270 


281 


-298 


ACLNPVLYAFVSLKFRKN 


4173 


.5 


4515 


. 5 


X 




271 


285 


-302 


PVLYAFVSLKFRKNFWKL 


1859 


.5 


2147 


.5 




X 


272 


289 


-306 


AFVSLKFRKNFWKLVKDI 


808 


.5 


1040 


. 5 






273 


293 


-310 


LKFRKNFWKLVKDIGCLP 


92 0 


.5 


957 


.5 






274 


297 


-314 


KNFWKLVKDIGCLPYLGV 


143 


.5 


82 


.5 






275 


301 


-318 


KLVKDIGCLPYLGVSHQW 


-2 


.5 


27 


.5 






276 


305 


-322 


DIGCLPYLGVSHQWKSSE 


17 


.5 


78 


.5 






277 


309 


-326 


L P YLGVSHQWKS S EDNS K 


111 


. 5 


122 


.5 






278 


313 


-330 


GVSHQWKS S EDNS KTFSA 


208 


. 5 


306 


.5 






279 


317 


-334 


QWKSSEDNSKTFSASHNV 


464 


. 5 


533 


.5 






280 


321 


-338 


S EDNS KT FS AS HNVE ATS 


524 


.5 


434 


.5 






281 


325 


-342 


S KT F S AS HNVE AT S M FQL 


1524 


.5 


1239 


.5 


X 




282 



These data indicate that, in addition to polypeptide
sequences derived from positions 9-26 of the STRL33
receptor, the polypeptide sequences LVISIFYHKLQSLTDVFL (SEQ
ID NO: 19) (53-70), PFWAYAGIHEWVFGQVMC (SEQ ID NO: 20)
(85-102), EAISTWLATQMTLGFFL (SEQ ID NO: 21) (185-202),
LTMIVCYSVIIKTLLHAG (SEQ ID NO: 22) (205-222),
MAVFLLTQMPFNLMKFIRSTHW (SEQ ID NO: 23) (237-258),
HWEYYAMTSFHYTIMVTE (SEQ ID NO: 24) (257-274),
ACLNPVLYAFVSLKFRKN (SEQ ID NO: 25) (281-298), and
SKTFSASHNVEATSMFQL (SEQ ID NO: 26) (325-342) comprise
multiple subsequences, each of which is capable of binding
to HIV-1 envelope gp120.
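Several of the longer STRL33 sequences named in this example, such as the 22-residue SEQ ID NO: 23 (237-258), are the union of adjacent 18-residue windows from the table (here 237-254 and 241-258). One way such overlapping windows could be merged is sketched below. The function name `merge_windows` is hypothetical, and the approach is an illustration rather than a procedure described in the text.

```python
def merge_windows(windows):
    """Merge overlapping (start, end, peptide) windows into longer spans.

    Positions are 1-based and inclusive. Windows must come from the same
    parent sequence so that overlapping residues agree.
    """
    merged = []
    for start, end, peptide in sorted(windows):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end, prev_peptide = merged[-1]
            if end > prev_end:
                # Append only the residues that extend past the previous span.
                extension = peptide[prev_end - start + 1:]
                merged[-1] = (prev_start, end, prev_peptide + extension)
        else:
            merged.append((start, end, peptide))
    return merged


# Two adjacent STRL33 windows from the table, plus one non-overlapping window.
STRL33_WINDOWS = [
    (237, 254, "MAVFLLTQMPFNLMKFIR"),
    (241, 258, "LLTQMPFNLMKFIRSTHW"),
    (281, 298, "ACLNPVLYAFVSLKFRKN"),
]

if __name__ == "__main__":
    for start, end, peptide in merge_windows(STRL33_WINDOWS):
        print(f"{start}-{end}  {peptide}")
```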

Example 4

This example identifies segments of the human CD4
protein that bind with gp120.

The second column in the table below identifies
the amino acid residue sequence of the polypeptide employed
in the assay. The first column identifies the sequence
coordinates of human CD4 that have an identical amino acid
sequence. The third column indicates the number of
radioactive decays (i.e., counts) that were counted, which
is indicative of the affinity of the synthetic polypeptide
for the gp120 protein. In the table below, polypeptides
retaining more than 4,000 counts identify fragments that
have a substantial capability to bind with gp120.
Polypeptides retaining more than 6,000 counts have more
substantial binding affinity. Polypeptides retaining at
least about 10,000 counts have a substantial and strong
capacity to bind to gp120. Of course, fragments
corresponding to amino acid coordinates 101-121 and 106-126
have a substantial, strong, and dominant capacity to bind
to gp120.

SEQ ID NO:



Bl 


( 1) 


1 


-21 


MNRGVPFRHLLLVLQLALLPA 


3587 


283 


CI 


( 2) 


6 


-26 


PFRHLLLVLQLALLPAATQGK 


4356 


284 


Dl 


( 3) 


11 


-31 


LLVLQLALLPAATQGKKWLG 


1785 


285 


El 


( 4) 


16 


-36 


LALLPAATQGKKWLGKKGDT 


1759 


286 


Fl 


( 5) 


21 


-41 


AATQGKKWLGKKGDTVELTC 


1562 


287 


Gl 


( 6) 


26 


-46 


KKWLGKKGDTVELTCTASQK 


1910 


288 


HI 


( 7) 


31 


-51 


GKKGDTVELTCTASQKKS IQF 


1831 


289 


A2 


( 8) 


36 


-56 


TVELTCTASQKKS IQFHWKNS 


1732 


290 


B2 


( 9) 


41 


-61 


CTASQKKS I QFHWKNSNQI KI 


1717 


291 


C2 


(10) 


46 


-66 


KKS I QFHWKNSNQI KI LGNQG 


2182 


292 


D2 


(11) 


51 


-71 


FHWKNSNQ I KI LGNQGS FLTK 


1835 


293 


E2 


(12) 


56 


-76 


SNQIKILGNQGSFLTKGPSKL 


1487 


294 


F2 


(13) 


61 


-81 


I LGNQGS FLTKGPSKLNDRAD 


1467 


295 


G2 


(14) 


66 


-86 


GSFLTKGPSKLNDRADSRRSL 


1844 


296 


H2 


(15) 


71 


-91 


KGPSKLNDRADSRRSLWDQGN 


1912 


297 


A3 


(16) 


76 


-96 


LNDRADSRRSLWDQGNFPLI I 


1753 


298 
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13 *3 
DO 


( 1 ' ) 


Q T 
O 1 


TOT 

-101 


UbKKoJ-iWJjyGJMr FJ_il lr\JNijJs.l 


O O O A 

2224 


O Q O 

2 y y 


L.3 


/ T Q \ 


0 6 


-10 6 


T T»7T^/*^OTVT'C "OT T T VTVTT VTUnCHT 

LiWJjyGINJr FljllJM>JijJN.lhjJJblJl 


*5 O A 

32 64 


-j O A 

3 00 


JJ J 


/I Q\ 

1 J- y ) 


Q T 

y 1 


T T T 
-111 


XMr Fill 1 xsJNJ J_» Jx 1 Ei JJoD 1 x 1C_E V 


1164 6 


*5 A T 
3 01 


XT' "3 
Ei 3 


f O A \ 

1 2 0 ) 


96 


T T £T 

-116 


TtATT V T CHT V T /^DT TT7H A VT? 

1 KNliKlhiJJblJl x lLb VhjJjyK.E 


O A "j O 

843 9 


"i A O 

3 02 


r j 


(21) 


TOT 
10 1 


TOT 

-121 


ifciDbiJi x lUbVbJjyK^hjVyijij 


/TOOT 

6803 


*j A "5 

3 03 


G3 


( 0 0 \ 

( 22 ) 


106 


T O 

- 126 


TYI CEVEDQKEEVQLLVFGIjT 


44965 


A A 

3 04 


T_T "j 

H3 


( 23 ) 


t 1 t 

111 


TOT 

- 13 1 


VEDQKEEVQLL1VFGI1TANSDT 


*T ^ O A O 

3624 9 


~i A C 

3 05 


7\ A 

A4 


/ O A \ 

( 24 ) 


11/: 
116 


- 13 6 


EEVQliljVr GlilANbDlHliLiyG 


T /I T O T 

14171 


3 06 


TD A 
B4 


(2b) 


TOT 
12 1 


- 14 1 


L V FGLjT AN b DTH L LQGQ S XjT Xj 


3683 


"j O O 

3 07 


C4 


( 26 ) 


T O /T 

126 


- 146 


TANSDTHLLQGQS IiTLiTLES P 


/" T T A 

6114 


"} o o 

3 08 


T-\ A 

D4 


(27) 


131 


-151 


THXjIiQGQS LTLiTLE S P PGS S P 


o cr rr o 

2552 


o o 

3 09 


E4 


/ 0 0 \ 


13 6 


- 156 


GQSIjTLTLESPPGSSPSVQCR 


1538 


310 


F4 


(29) 


14 1 


-161 


LTLESPPGSSPSVQCRSPRGK 


1476 


311 


G4 


(30) 


146 


-166 


PPGSSPSVQCRSPRGKNIQGG 


1496 


312 


TT vl 

H4 


(31) 


151 


-171 


PSVQCRSPRGKNIQGGKTLSV 


1400 


313 


A5 


( 32 ) 


156 


-176 


norinAT/TiTTAnnT/rrT otto/*^t t~«t 

RSPRGKN I QGGKTLSVSQLE L 


2066 


314 


B5 


( 33 ) 


161 


-181 


KNIQGGKTLSVSQLELQDSGT 


3 078 


315 


C5 


( 34 ) 


166 


-186 


/"tT/rriT fTTCAT T7> T nArnT.TrpAHM t 

GKTIjSVSQLELQDSGTWTCTV 


2618 


316 


to it 

D5 


(3b) 


171 


TOT 

-191 


VSQLjEIjQDSGTWTCTVLQNQK 


3 879 


317 


T-1 (— 

E5 


(36) 


176 


T O 

-196 


T AH CT"T'T»TTirt' 1 1| TT AMA VT/T rP PTV 

LQDSGTWTCTVLQNQKKVEFK 


2456 


318 


F5 


( 3 / ) 


TOT 
181 


O O T 

-2 01 


T WT CTV LQNQKKVE F K I D I W 


403 0 


319 


Gb 


( 3 0 ) 


18 6 


00^ 
-206 


T TT AXTAT/TA 7r>T?T/TT^TtnTT 7\ t?AT/' 

VliQNQKKV E F K I D I W liAFQK 


973 7 


O O A 

320 


t t r- 
Ho 


(39) 


TOT 
191 


O T T 

-211 


T/VTTT'T7VT'nTT7TTT HT7A17 7\ PCTT7 

KKVE F K I D 1 VVIiA F Q KAS S I V 


6313 


321 


A6 


/ X A \ 

( 40 ) 


TO/" 

196 


O T f 

- 216 


VTRTTTTTT 71 PAT/7\ f f» TT n/ WTT A 

KI D I VVLiAFQKAS S I VYKKEG 


3681 


322 


B6 


/ A T \ 

(41) 


O O T 

2 01 


O O T 

-221 


T TT TV T?AV7V A ATTTT/WrTtr'ATTTTTI 

VIjAFQKAS s I vykkegeqve f 


3566 


323 


L6 


/ /I 0 \ 

(42 ) 


0 r\ /T 

2 06 


00^" 
-226 


V7\ r'OTT7VT/T/T?APAT7PT7ArinT "A 

KAS S I VY KKE GEQ VE F S F PLA 


14347 


324 


D6 


(43 ) 


211 


-231 


VYKKEGEQVE FS FPLAFTVE K 


14740 


325 


E6 


( 44 ) 


O T /■"" 

216 


-236 


/^P/NTTppriTITNT TV THUIT TTHT/T m^»/~l/~1 

GEQVEFSFPIiAFTVEKLTGSG 


18549 


326 


F6 


t A C \ 

(45) 


O O T 

221 


O A T 

-241 


TIOT - »OT TV T"irpT TT - 1 jrT rtlA AAPT T.TT.TA 

FS FPLAFTVE KXjTGSGELWWQ 


9673 


327 


Go 


( 46 ) 


00/^ 
22 6 


O A f~ 

-246 


7\ T—l Mil rp jr -r rp /~1 O T— 1 T TVTT»T/~\ 7\ T~l P> T\ O 

A FTVE KXiTG SGE L WWQAE RAS 


3 992 


328 


T T /- 

H6 


(47) 


O *"> T 

231 


-251 


TTT rpi^l ri/-ipT T«TT«TO 7V T~l T-» Tt O O O T7PT.T 

KLTGSGELWwQAERAS S S KS W 


1878 


329 


A7 


( 48 ) 


23 6 


-256 


/—I -T— 1 T T»TTVT/"\ 7V T~l TJ TV O O O TT" O T»T T rp ppT 

GEIjWWQAERASSSKSwITFDL 


273 0 


33 0 


B7 


(49 ) 


241 


-2 61 


s~\ t\ 7\ /■*"* <^ t^t /^i t.t t m nrvT tthtt/tit t 

qaeras s S KSWI TFDLKNKEV 


2588 


331 


C7 


/ c a \ 

( 50 ) 


O A /T 

24 6 


-2 66 


OffT/fTiTTTiPnT T/IT T/PT7PT7T7PT7 

SSSKSWITFDLKNKEVSVKRV 


1761 


332 


D7 


( 51 ) 


251 


-271 


TiTTrpPPT T/XTT/PT 7PT 7 T/TIT TW AT\ nT7 

WITFDLKNKEVSVKRVTQDPK 


2126 


333 


hi / 


/ r 0 \ 

( 52 ) 


256 


-276 


T T/XTT/CT TAT 7 Vm 7rp AP PT/T OHfl Ai V 

LKNKEVSVKRVTQDPKLQMGK 


2288 


334 


F7 


( 53 ) 


O C T 

261 


-281 


T, TOT TT7T1T TUI AT>r(T7T y*\TVVT/"t T^T r\T TT 

VS VKRVTQD P KLQMGKKL PLH 


1848 


335 


G / 


t C A \ 

(54 ) 


«~j /■- 
2 6 6 


0 0 
-286 


T7TAPT*>T/T AMAVT/T OT T TT mT O/O 

VTQDPKXjQMGKKLPLHL»TLiPQ 


2 075 


33 6 


T_T ""7 

H / 


( bb ) 


2 /l 


O O T 

-2 91 


TyTT AMA'T/1/T OT T T T rpx nA7\ T O/OTiT 

KLQMGKKLi PLHIjTLiPQAL PQY 


194 9 


337 


Ao 


( b6 ) 


2 7 6 


O O £T 

-2 96 


jy T/" T TTT T T T rp T PA7V T PAT/7\ O AA*M 

KKL P LiHliT Ij PQAL PQ YAGS GN 


1922 


338 


0 0 

DO 


( b / ) 


O O T 

2 81 


O O T 

-301 


TT T TT OA7\T PAW7V O/^TVTT rp t 74 T 

H L1TL1 PQALi PQY AGS GNLT LAL 


23 94 


33 9 


Co 


( 58 ) 


2 8 6 


-3 06 


A\ T\ T PAV TV O A\TT rp t t\ -r y- 1 7\ T/rpA 

QAL PQ YAG S GN LT LjAL E AKTG 


2364 


340 


D8 


( 59 ) 


2 91 


-3 11 


YAGSGNLTLALEAKTGKLHQE 


1830 


341 


E8 


( 60 ) 


296 


-316 


ITT m T Ti T n Tt T^*m T/T TTrtTITTlTT T TT T 

NLTLALEAKTGKLHQEVNLW 


1676 


342 


F8 


( 61 ) 


A T 

3 01 


-321 


T t — i Tt T>r m y"l t^t t Tr\TiT t>tt t tt T*if t*> Tt m /~\. 

LEAKTGKLHQEVNLWMRATQ 


1729 


343 


G8 


( 62 ) 


3 06 


-326 


nt/T TTr\nTT\TT TTTnurr* * rnr\T ^^t^-xtt 

GKLHQEVNLWMRATQLQKNL 


1776 


344 


T T O 

H8 


(63 ) 


"5 T T 

311 


-331 


T-ITTTlTT TTTT»Jir> H mr\T /\T/>TT r-ri T — 1 T TT.T 

EVNLWMRATQLQKNLTCEVW 


2183 


345 


A9 


( 64 ) 


316 


-33 6 


VMRATQLQKNLTCEVwGPTS P 


2144 


346 


"O O 

B9 


( 65 ) 


O T 

3 21 


3 41 


AT AVXTT rPAPTTT.TArilTlfinT/T Ik/IT O 

QLQKNLTCEVWGPTS PKLMLS 


1856 


347 


C9 


(66) 


326 


3 4 6 


LTCEVWGPTSPKLMLSLKLEN 


2412 


348 


D9 


(67) 


331 


-351 


WGPTSPKLMLSLKLENKEAKV 


2414 


349 


E9 


(68) 


336 


-356 


PKLML S LKLENKE AKVS KRE K 


1656 


350 


F9 


(69) 


341 


-361 


S LKLENKE AKVS KRE KAVWVL 


1663 


351 


G9 


(70) 


346 


-366 


NKEAKVSKREKAVWVLNPEAG 


1735 


352 


H9 


(71) 


351 


-371 


VSKREKAVWVLNPEAGMWQCL 


2034 


353 


A10 


(72) 


356 


-376 


KAVWVLNPEAGMWQCLLSDSG 


3133 


354 
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BIO (73 ) 


361 


-3 81 


LiNPkAGMWyLJjLbDbGQVLjLjE 


6316 


3 55 


L10 ( 74 ; 


3 66 


-3 86 


GIVIWyCljijbUbCjQVIjLiESNI KV 


4185 


->{—/- 
356 


D10 ( 75 ) 


371 


DOT 

-3 91 


IiLi b JJ b Gy V LiLtli b W 1 K V J_j P 1 W b 


23 75 


O C *"7 

357 


t*±V v /d ; 


"3 *7 C 
3/6 


- j y d 


PAT7T T C 1 C M T T/"T 7T "D TTa7 O T 1 T3T 7/^ T3 


o n o o 

2 uo y 


"3 C O 


rlU ( / / ; 


"3 O T 


/inn 
- 4 U 1 


tt> o XT T VT TT DTTaT C T DUH DM A T T\7 

JibJM 1 JvVJ_iFl Wb I FvyirIYLA.ljl V 


1 Q Q O 

iy y2 


O C Q 

5 b y 


LrlU \ /o ) 


~) O £T 


-406 


T TT T~j T'TaT C? ""P TIT 7/^ T3M 7V T TT7T 

V J_iFl Wb 1 FVyFIVlAijl V LiLtCjVA 


O 1 G *"7 

2 iy / 


*3 rv 
3 6 0 


rilU ^ / y / 


"3 O 1 

3 y i 


vl 1 T 

-411 


CTTJITADMAT TT7T T T "C 

b 1 P VvJ FMAIjI VijCjVjVALjijijJ_ir 


O C O T 

2 b2 / 


"3 £T T 
3 61 


ATI / O A \ 

All (80) 


"3 O ^ 

3 y 6 




T31VVI7V T TT7T PPTTTXHT T T TT' T f~~* T Z^" 1 X 

PMAIjI VijGGVAGLIjLir IGIjGI 


*3 r\ *7 

3 067 


"3 C O 

3 62 


nil / o i \ 
Bll lol) 


4 01 


-421 


T TT /*i /""IT 7 7\ T T T "dXOT PT DDPTTD 


*3 ^7 "3 O 

3738 


"3 /T *3 

363 


Cll ( 82 ) 


4 0 6 


/I ^ y^" 

-42 6 


7\/^T T T TT> T /~1 T X TT* TT" t~*\ TT1 n T T T1 T"i 

AGLiLiLiF IGLiGIF FCVRCRHRR 


»~\ r\ c\ c\ 

2 099 


364 


"nil / p"j \ 

D XX \ O J ) 


4 X X 


ft j± 


T?Tr"2T nT T? 'C , r T \7T?r ,, D T-TDT?P Pi 

r J. Vjjijkjj _L r r v JK^K.rlK.i\.l\.yAiLi\. 


x y u u 


job 


Ell (84) 


416 


-436 


I FFCVRCRHRRRQAERMSQ I K 


2085 


366 


Fll (85) 


421 


-441 


RCRHRRRQAERMSQIKRLLSE 


2075 


367 


Gil (86) 


426 


-446 


RRQAERMSQIKRLLSEKKTCQ 


1607 


368 


Hll (87) 


431 


-451 


RMSQI KRLLSEKKTCQCPHRF 


2020 


369 


A12 (88) 


436 


-456 


KRLLSEKKTCQCPHRFQKTCS 


1674 


370 


B12 (89) 


441 


-458 


EKKTCQCPHRFQKTCSPI 


2006 


371 


Al ( 0) 






empty (control) 


2075 
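The count thresholds stated for Example 4 (more than 4,000, more than 6,000, and at least about 10,000 counts) can be applied mechanically to the table above. The following sketch does so for a handful of rows whose values are legible in the table. The function name and category labels are hypothetical, and treating "about 10,000" as exactly 10,000 is an assumption.

```python
def binding_category(counts):
    """Map raw counts to the qualitative categories described in Example 4."""
    if counts >= 10_000:  # "at least about 10,000" read as exactly 10,000
        return "substantial and strong"
    if counts > 6_000:
        return "more substantial"
    if counts > 4_000:
        return "substantial"
    return "below threshold"


# (CD4 coordinates, counts) copied from legible rows of the table
CD4_ROWS = [
    ("1-21", 3587),
    ("6-26", 4356),
    ("106-126", 44965),
    ("191-211", 6313),
    ("211-231", 14740),
]

if __name__ == "__main__":
    for coords, counts in CD4_ROWS:
        print(f"{coords:>8}  {counts:>6}  {binding_category(counts)}")
```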





Example 5 

This example shows the binding of 125I-HIV-1LAI gp120 to
the amino termini of CCR5, CXCR4, and STRL33 as a function
of peptide position and length. Synthetic
peptide arrays of nonapeptides, dodecapeptides,
pentadecapeptides and octadecapeptides derived from the CCR5
(panel A), CXCR4 (panel B) and STRL33 (panel C) amino
terminal domains were prepared and used to test the
binding of 125I-HIV-1LAI envelope gp120. Ordinal sequence
position numbers are given in accordance with the sequence
data provided by the GenBank database for CCR5 (accession
No. g1457946, gi|1457946), CXCR4 (accession No. g539677,
gi|400654 / sp|P30991) and STRL33 (accession No. g2209288,
gi|2209288). The counts shown are the counts detected in
each well minus the background counts (i.e., counts
observed in the assay when no polypeptide was bound to the
well of the 96-well assay plate).
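The peptide arrays described here pair each starting position with windows of 9, 12, 15 and 18 residues, dropping the longer windows once they would run past the end of the amino-terminal segment. A short sketch of that layout follows. The 31-residue CCR5 segment is assembled from the panel A peptides, and the function name `nested_windows` is hypothetical.

```python
WINDOW_LENGTHS = (9, 12, 15, 18)

# CCR5 residues 1-31, assembled from the peptides listed in panel A.
CCR5_N_TERMINUS = "MDYQVSSPIYDINYYTSEPCQKINVKQIAAR"


def nested_windows(sequence, lengths=WINDOW_LENGTHS):
    """Return {start: [peptides]} with every window that fits in the sequence.

    Start positions are 1-based. Peptides for one start share the same
    first residue and are ordered from shortest to longest.
    """
    rows = {}
    for start in range(1, len(sequence) + 1):
        peptides = [
            sequence[start - 1:start - 1 + n]
            for n in lengths
            if start - 1 + n <= len(sequence)
        ]
        if peptides:
            rows[start] = peptides
    return rows


if __name__ == "__main__":
    for start, peptides in nested_windows(CCR5_N_TERMINUS).items():
        print(start, peptides[-1], len(peptides))
```

For the 31-residue segment this yields 23 starting positions, with the 18-mer available only through position 14, which matches the shape of panel A.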






Panel A

CCR5

Binding results for each window length are given as counts bound minus
background (no peptide). In each sequence row the 9-, 12-, 15- and
18-mers share the same initial starting point; the sequence shown is
the longest window tested from that starting point.

Initial                         Binding results for         SEQ
Sequence  Peptide Sequence      window length               ID
#         Scanning Windows      9      12     15     18     NO:

1         MDYQVSSPIYDINYYTSE    543    2682   4976   5880   372
2         DYQVSSPIYDINYYTSEP    1552   3089   5401   6363   373
3         YQVSSPIYDINYYTSEPC    2533   5305   5415   6119   374
4         QVSSPIYDINYYTSEPCQ    490    1959   4594   5645   375
5         VSSPIYDINYYTSEPCQK    509    1629   3280   3521   376
6         SSPIYDINYYTSEPCQKI    671    1739   3498   3285   377
7         SPIYDINYYTSEPCQKIN    1503   3463   4575   3234   378
8         PIYDINYYTSEPCQKINV    1186   2285   2682   2036   379
9         IYDINYYTSEPCQKINVK    1359   2702   2516   1261   380
10        YDINYYTSEPCQKINVKQ    4379   5245   3052   1913   381
11        DINYYTSEPCQKINVKQI    1396   1361   1144   712    382
12        INYYTSEPCQKINVKQIA    1384   1190   707    684    383
13        NYYTSEPCQKINVKQIAA    1548   977    760    595    384
14        YYTSEPCQKINVKQIAAR    1029   1052   847    638    385
15        YTSEPCQKINVKQIA       567    507    459           386
16        TSEPCQKINVKQIAA       440    427    509           387
17        SEPCQKINVKQIAAR       434    430    426           388
18        EPCQKINVKQIA          397    432                  389
19        PCQKINVKQIAA          386    385                  390
20        CQKINVKQIAAR          435    581                  391
21        QKINVKQIA             453                         392
22        KINVKQIAA             487                         393
23        INVKQIAAR             474                         394

Panel B

CXCR4

Binding results for each window length are given as counts bound minus
background. In each sequence row the 9-, 12-, 15- and 18-mers share the
same initial starting point; the sequence shown is the longest window
tested from that starting point.

Initial                         Binding results for         SEQ
Sequence  Peptide Sequence      window length               ID
#         Scanning Windows      9      12     15     18     NO:

1         MEGISIYTSDNYTEEMGS    591    334    3275   2079   395
2         EGISIYTSDNYTEEMGSG    a      886    7255   1548   396
3         GISIYTSDNYTEEMGSGD    454    2644   3274   1217   397
4         ISIYTSDNYTEEMGSGDY    466    3973   2202   861    398
5         SIYTSDNYTEEMGSGDYD    a      288    168    239    399
6         IYTSDNYTEEMGSGDYDS    332    335    195    173    400
7         YTSDNYTEEMGSGDYDSM    181    161    201    103    401
8         TSDNYTEEMGSGDYDSMK    a      54     119    38     402
9         SDNYTEEMGSGDYDSMKE    151    149    124    161    403
10        DNYTEEMGSGDYDSMKEP    67     121    57     102    404
11        NYTEEMGSGDYDSMKEPC    a      100    30     134    405
12        YTEEMGSGDYDSMKEPCF    68     213    70     103    406
13        TEEMGSGDYDSMKEPCFR    146    67     23     47     407
14        EEMGSGDYDSMKEPCFRE    a      61     121    130    408
15        EMGSGDYDSMKEPCFREE    64     36     69     64     409
16        MGSGDYDSMKEPCFREEN    57     68     64     129    410
17        GSGDYDSMKEPCFREENA    a      155    172    155    411
18        SGDYDSMKEPCFREENAN    100    118    186    89     412
19        GDYDSMKEPCFREENANF    53     167    198    134    413
20        DYDSMKEPCFREENANFN    a      167    146    75     414
21        YDSMKEPCFREENANFNK    171    144    80     89     415
22        DSMKEPCFREENANFNKI    85     144    146    40     416
23        SMKEPCFREENANFN       a      119    55            417
24        MKEPCFREENANFNK       188    133    74            418
25        KEPCFREENANFNKI       165    105    93            419
26        EPCFREENANFN          a      69                   420
27        PCFREENANFNK          104    108                  421
28        CFREENANFNKI          103    66                   422
29        REENANFNK             58                          423

a Not done






Panel C

STRL33

Binding results for each window length are given as counts bound minus
background. In each sequence row the 9-, 12-, 15- and 18-mers share the
same initial starting point; the sequence shown is the longest window
tested from that starting point.

Initial                         Binding results for         SEQ
Sequence  Peptide Sequence      window length               ID
#         Scanning Windows      9      12     15     18     NO:

1         MAEHDYHEDYGFSSFNDS    160    625    1239   1386   424
2         AEHDYHEDYGFSSFNDSS    354    697    1095   1014   425
3         EHDYHEDYGFSSFNDSSQ    509    937    2235   1219   426
4         HDYHEDYGFSSFNDSSQE    708    1427   1772   1500   427
5         DYHEDYGFSSFNDSSQEE    851    1554   1240   1191   428
6         YHEDYGFSSFNDSSQEEH    728    1950   1357   985    429
7         HEDYGFSSFNDSSQEEHQ    729    1077   947    537    430
8         EDYGFSSFNDSSQEEHQA    953    817    1152   548    431
9         DYGFSSFNDSSQEEHQAF    701    573    595    440    432
10        YGFSSFNDSSQEEHQAFL    345    745    645    1138   433
11        GFSSFNDSSQEEHQAFLQ    171    480    270    1639   434
12        FSSFNDSSQEEHQAFLQF    249    403    361    3608   435
13        SSFNDSSQEEHQAFLQFS    243    277    902    6038   436
14        SFNDSSQEEHQAFLQFSK    304    303    969    4537   437
15        FNDSSQEEHQAFLQFSKV    246    470    4089   4678   438
16        NDSSQEEHQAFLQFS       180    497    6160          439
17        DSSQEEHQAFLQFSK       147    882    4588          440
18        SSQEEHQAFLQFSKV       287    4455   4732          441
19        SQEEHQAFLQFS          647    7512                 442
20        QEEHQAFLQFSK          1109   5672                 443
21        EEHQAFLQFSKV          6060   5598                 444
22        EHQAFLQFS             7505                        445
23        HQAFLQFSK             2761                        446
24        QAFLQFSKV             2600                        447



Example 6 

This example shows 125I-HIV-1 LAI gp120 binding to N-terminal peptide variants of CCR5, CXCR4, and STRL33.

Octadecapeptide alanine replacement variants of maximum gp120 binding activity peaks were synthesized and tested for 125I-HIV-1 LAI gp120 binding. Each binding value presented is the average of two separate synthesis and binding experiments. Relative percentage of Control = [(mean counts / Control counts) x 100%] ± average deviation. Background counts (no peptide, see Example 7) were subtracted from all values. Data for CCR5 are presented in Panel A; data for CXCR4 are presented in Panel B; and data for STRL33 are presented in Panel C.
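The calculation described above can be sketched as follows. This is a minimal illustration only: the function name, the argument layout, and the example counts are hypothetical, and "average deviation" is assumed to mean the mean absolute deviation of the replicate percentages.

```python
def relative_percent(replicate_counts, control_counts, background):
    """Return (mean %, average deviation %) relative to the wild-type control.

    replicate_counts: raw counts for one variant peptide, one per experiment
    control_counts:   raw counts for the wild-type (Control) peptide
    background:       raw counts retained with no peptide
    """
    # Background is subtracted from every value before taking ratios.
    control = control_counts - background
    percents = [100.0 * (c - background) / control for c in replicate_counts]
    mean = sum(percents) / len(percents)
    # Average deviation, assumed here to be the mean absolute deviation.
    avg_dev = sum(abs(p - mean) for p in percents) / len(percents)
    return mean, avg_dev


# Hypothetical counts from two separate experiments (not data from the tables).
mean_pct, avg_dev = relative_percent([1100, 1300], control_counts=2100, background=100)
print(f"{mean_pct:.0f} ± {avg_dev:.0f}")
```

Averaging the per-replicate percentages gives the same mean as dividing the mean counts by the control counts, because the normalization is linear.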

Panel A. 125I-HIV-1 LAI gp120 binding to N-terminal peptide variants of CCR5

CCR5 variant peptides (1-18)       Relative % of Control a    SEQ ID NO:

Control   MDYQVSSPIYDINYYTSE       100                        448
M1A       ADYQVSSPIYDINYYTSE       167 ± 4?                   449
D2A       MAYQVSSPIYDINYYTSE       125 ± 8                    450
Y3A       MDAQVSSPIYDINYYTSE       51 ± 2                     451
Q4A       MDYAVSSPIYDINYYTSE       104 ± 7                    452
V5A       MDYQASSPIYDINYYTSE       82 ± 3                     453
S6A       MDYQVASPIYDINYYTSE       124 ± 3                    454
S7A       MDYQVSAPIYDINYYTSE       56 ± 2                     455
P8A       MDYQVSSAIYDINYYTSE       157 ± 2                    456
I9A       MDYQVSSPAYDINYYTSE       24 ± 7                     457
Y10A      MDYQVSSPIADINYYTSE       19 ± 6                     458
D11A      MDYQVSSPIYAINYYTSE       63 ± 22                    459
I12A      MDYQVSSPIYDANYYTSE       14 ± 1                     460
N13A      MDYQVSSPIYDIAYYTSE       253 ± 19                   461
Y14A      MDYQVSSPIYDINAYTSE       15 ± 0.3                   462
Y15A      MDYQVSSPIYDINYATSE       21 ± 5                     463
T16A      MDYQVSSPIYDINYYASE       78 ± 34                    464
S17A      MDYQVSSPIYDINYYTAE       64 ± 6                     465
E18A      MDYQVSSPIYDINYYTSA       4 ± 2                      466

a The percent binding for the wild-type peptide was defined as 100%.

Panel B. 125I-HIV-1 LAI gp120 binding to N-terminal peptide variants of CXCR4

CXCR4 variant peptides (1-18)      Relative % of Control a    SEQ ID NO:

Control   MEGISIYTSDNYTEEMGS       100                        467
M1A       AEGISIYTSDNYTEEMGS       118 ± 18                   468
E2A       MAGISIYTSDNYTEEMGS       36 ± 0.3                   469
G3A       MEAISIYTSDNYTEEMGS       101 ± 3                    470
I4A       MEGASIYTSDNYTEEMGS       6 ± 0.3                    471
S5A       MEGIAIYTSDNYTEEMGS       133 ± 5                    472
I6A       MEGISAYTSDNYTEEMGS       [illegible]                473
Y7A       MEGISIATSDNYTEEMGS       7 ± 0.4                    474
T8A       MEGISIYASDNYTEEMGS       97 ± 10                    475
S9A       MEGISIYTADNYTEEMGS       70 ± 4                     476
D10A      MEGISIYTSANYTEEMGS       71 ± 8                     477
N11A      MEGISIYTSDAYTEEMGS       38 ± 0.4                   478
Y12A      MEGISIYTSDNATEEMGS       28 ± 2                     479
T13A      MEGISIYTSDNYAEEMGS       70 ± 6                     480
E14A      MEGISIYTSDNYTAEMGS       72 ± 1                     481
E15A      MEGISIYTSDNYTEAMGS       56 ± 7                     482
M16A      MEGISIYTSDNYTEEAGS       88 ± 4                     483
G17A      MEGISIYTSDNYTEEMAS       68 ± 8                     484
S18A      MEGISIYTSDNYTEEMGA       79 ± 1                     485

a The percent binding for the wild-type peptide was defined as 100%.

Panel C. 125I-HIV-1 LAI gp120 binding to N-terminal peptide variants of STRL33

STRL33 variant peptides (21-38)    Relative % of Control a    SEQ ID NO:

Control   EEHQAFLQFSKVFLPCMY       100                        486
E21A      AEHQAFLQFSKVFLPCMY       81 ± 2                     487
E22A      EAHQAFLQFSKVFLPCMY       70 ± 1                     488
H23A      EEAQAFLQFSKVFLPCMY       99 ± 1                     489
Q24A      EEHAAFLQFSKVFLPCMY       72 ± 1                     490
A25A      EEHQAFLQFSKVFLPCMY       101 ± 1                    491
F26A      EEHQAALQFSKVFLPCMY       32 ± 0.1                   492
L27A      EEHQAFAQFSKVFLPCMY       37 ± 2                     493
Q28A      EEHQAFLAFSKVFLPCMY       44 ± 0.4                   494
F29A      EEHQAFLQASKVFLPCMY       20 ± 1                     495
S30A      EEHQAFLQFAKVFLPCMY       92 ± 2                     496
K31A      EEHQAFLQFSAVFLPCMY       162 ± 2                    497
V32A      EEHQAFLQFSKAFLPCMY       51 ± 3                     498
F33A      EEHQAFLQFSKVALPCMY       45 ± 2                     499
L34A      EEHQAFLQFSKVFAPCMY       76 ± 1                     500
P35A      EEHQAFLQFSKVFLACMY       82 ± 3                     501
C36A      EEHQAFLQFSKVFLPAMY       53 ± 5                     502
M37A      EEHQAFLQFSKVFLPCAY       112 ± 4                    503
Y38A      EEHQAFLQFSKVFLPCMA       83 ± 2                     504

a The percent binding for the wild-type peptide was defined as 100%.



Example 7 

This example demonstrates that the binding of HIV-1 gp120 envelope protein to the polypeptides of the present invention, and to the chemokine receptors from which the present inventive polypeptides were originally derived or inspired, is conserved across the various strains of HIV-1. This example also demonstrates that a step subsequent to initial binding of gp120 to CCR5, CXCR4, STRL33, and CD4 is the most likely source of the phenomenon of host-range selectivity. Additionally, this example demonstrates that the underlying method is accurate, in that receptor variants that are predicted to have an altered affinity for binding with gp120 do in fact have a statistically similar alteration in affinity where comparable changes in the receptors have been identified in other work and the affinity for binding of gp120 or the effect on infectivity has been measured.

This example examines the effect of particular mutations of CCR5 that were studied in the work underlying the present invention and that were also studied by other artisans in the field.

The following table identifies a mutation in the first column. The first letter designates the wild-type amino acid present at the position indicated by the number, and the letter A, which terminates all entries in the first column, indicates that the amino acid residue present in that position in the mutant polypeptide is alaninyl. For example, the first data row (i.e., the second row of the table) contains the entry Y3A in the first column, which indicates that the tyrosine residue at position 3 of the wild-type CCR5 is substituted by an alanine residue.
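The naming convention just described can be illustrated with a short sketch that generates every single-alanine replacement of a peptide together with its label. The function name and the `first_position` parameter are hypothetical and are introduced only for this illustration.

```python
def alanine_scan(wild_type, first_position=1):
    """Return (label, sequence) pairs for every single-alanine replacement.

    Labels follow the convention <wild-type residue><position>A, e.g. "Y3A".
    first_position is the residue number of the first letter of wild_type.
    """
    variants = []
    for offset, residue in enumerate(wild_type):
        label = f"{residue}{first_position + offset}A"
        mutant = wild_type[:offset] + "A" + wild_type[offset + 1:]
        variants.append((label, mutant))
    return variants


# CCR5 (1-18) wild-type peptide from Example 6, Panel A.
for label, sequence in alanine_scan("MDYQVSSPIYDINYYTSE")[:3]:
    print(label, sequence)
```

For a peptide that does not start at residue 1, such as STRL33 (21-38), passing `first_position=21` yields labels beginning with E21A.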

The second column provides the percentage of binding exhibited by a mutant polypeptide compared to a wild-type polypeptide, when the methods used to elucidate the present invention are used in conjunction with radiolabeled HIV-1 LAI gp120 envelope protein. The third through seventh columns provide similar data that have been extracted from the work of others in the field using the strain of HIV-1 virus indicated at the top of each column. For example, row 2 of the following table indicates that when the mutation Y3A is effected in the human CCR5 chemokine receptor, the resulting CCR5 polypeptide has 51.4% of the ability to bind HIV-1 LAI gp120 envelope protein in comparison to an equivalent wild-type peptide. Similarly, HIV-1 ADA binds to the mutant polypeptide with 79% of the affinity of a non-mutated CCR5 chemokine receptor.

Mutation   gp120    YU2    ADA    JR-FL   89.6   DH123

Y3A        51.4     n/a    79     82      n/a    42
Q4A        104      85     132    111     67     105
Y10A       19.2     2      50     26      10     3
D11A       62.8     2      27     22      6      3
Y14A       14.6     12     47     25      6      0
Y15A       21       30     3      3       1      0
E18A       4.1      45     12     12      3      10



Statistical analysis of these data indicates that the similarity between the binding affinity of each mutant peptide for gp120 elucidated in this study is not more than about 25% likely to be causally unrelated to the effects observed for YU2, and not more than about 4% likely to be causally unrelated to the effects observed for each of the other viruses listed in the table above.
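The text does not state which statistical test was used. As one way a reader might explore how closely the columns track each other, the sketch below computes a Pearson correlation coefficient between the gp120 column of this study and each of the other columns, skipping "n/a" entries. The choice of Pearson's r and the helper name `pearson_r` are assumptions made for illustration; this is not presented as the analysis behind the percentages quoted above.

```python
from math import sqrt

# Percent-of-wild-type values transcribed from the table above; None marks "n/a".
# Rows, in order: Y3A, Q4A, Y10A, D11A, Y14A, Y15A, E18A.
THIS_STUDY = [51.4, 104, 19.2, 62.8, 14.6, 21, 4.1]
OTHER_STRAINS = {
    "YU2":   [None, 85, 2, 2, 12, 30, 45],
    "ADA":   [79, 132, 50, 27, 47, 3, 12],
    "JR-FL": [82, 111, 26, 22, 25, 3, 12],
    "89.6":  [None, 67, 10, 6, 6, 1, 3],
    "DH123": [42, 105, 3, 3, 0, 0, 10],
}


def pearson_r(xs, ys):
    """Pearson correlation over the positions where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


correlations = {strain: pearson_r(THIS_STUDY, values)
                for strain, values in OTHER_STRAINS.items()}
for strain, r in correlations.items():
    print(f"{strain}: r = {r:.2f}")
```

With only six or seven mutations per column, any such coefficient is a rough indicator at best.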

Additionally, the affinity measurements generated by the underlying technique have been demonstrated to be accurate by (repetitively) showing that antibodies that specifically bind to radiolabeled gp120 are capable of preventing the binding of gp120 to polypeptides that have shown high affinity for binding with gp120 in the experiments upon which the present invention is predicated. Thus, this example shows that the binding of HIV-1 with chemokine receptors can be inhibited by the present inventive polypeptides, irrespective of the strain of HIV-1 from which the gp120 protein is obtained.

Example 8 

This example provides a characterization of the critical amino acids in the amino-terminal segments of CCR5, CXCR4, and STRL33 that are essential for the ability of these polypeptides to bind with gp120.

In this example, the effect on binding that occurs due to successive replacement of each amino acid with alanine is indicated, wherein a (+) signifies a decrease in binding affinity and a (>) signifies an enhancement in binding affinity. As is clear from inspection, the sequences are shown with the amino-terminus at top and the carboxyl-terminus at bottom.



CCR5 (1-18)    CXCR4 (1-18)    STRL33 (21-38)

M>             M               E
D              E+              E
Y++            G               H
Q              I+++++          Q
V              S>              A
S              I++++++         F+++
S+             Y+++++          L++
P>             T               Q+
I+++           S+              F+++
Y+++           D+              S
D+             N++             K>
I++++          Y++             V+
N>             T               F+
Y++++          E               L
Y+++           E++             P
T              M               C+
S+             G               M
E+++++         S               Y



Example 9 

This example employs the same technique as Example 4 and provides information similar to that available from Example 4.

The data below compare the ability of synthetic fragments of CD4 to bind to labeled gp120. 9-mers, 12-mers, 15-mers, 18-mers, and 21-mers were selected based on the data from Example 4. The relative binding affinities of each group of polypeptides can be determined by inspection of the number of counts of radiolabeled gp120 that were retained by each N-mer. Data supporting these conclusions are provided by Examples 10 and 11.

Peptide Starting Position #


Active Peptides


gp120 Bound (counts)


SEQ ID NO:


Peptide Starting Position #


Active Peptides


gp120 Bound (counts)


SEQ ID NO:



ACTIVE 9-MERS 








ACTIVE 12-MERS 






105 


DTYICEVED 


1043 


505 


101 


I EDSDTY I CEVE 


1107 


530 


115 


KEEVQLLVF 


1273 


506 


112 


EDQ KEEVQLLVF 


1379 


531 


116


EEVQLLVFG


3170


507


113


DQKEEVQLLVFG


1624


532


117


EVQLLVFGL


2146


508


114


QKEEVQLLVFGL


1785


533








115


KEEVQLLVFGLT


1774


534












116 


EEVQLLVFGLTA 


3261 


535 










117 


EVQLLVFGLTAN 


1838 


536 










133 


LLQGQSLTLTLE 


1320 


537 


217 


ay vor or Jtr j_» 


i mo 

1UJ-. 


D w _7 


91 ^ 


Cjuoy v iii r or tr L±e\ 


Xfx 3 O 


3 .3 0 


218 


\^ V Ei E or ir J_Lrt 


12 0 5 


J X V 


216 


rjpnvppQPPT.ap 

UlJ^ V ElE Or ±T i_Lrti7 


X / 4b J? 


^ "3 Q 


219 


VFF^FPT.AP 

V Hi 17 L-J ±7 XT JJnf 


1064 


3 X X 


217 


POVPPQPPT.APT 
£iy vr.r or iriinr x 


XjjO 


3 ft v 










218 


Vw/ v Ej e or r Ltrtr x v 


X D O D 


C41 
3 *± X 




ACTIVE 15-MERS








ACTIVE 18-MERS






109 


CEVEDQ KEEVQLLVF 


1729 


512 


105 


DTYI CEVEDQKEEVQLLV 


1648 


542 


110 


EVEDQKEEVQLLVFG 


2805 


513 


106 


T Y I CEVEDQKEEVQLLVF 


3794 


543 


111 


VEDQKEEVQLLVFGL 


3816 


514 


107 


YI CEVEDQKEEVQLLVFG 


4611 


544 


112 


EDQKEEVQLLVFGLT 


3633 


515 


108 


I CEVEDQ KEEVQLLVFGL 


3898 


545 


113 


DQKEEVQLLVFGLTA 


3905 


516 


109 


CEVEDQKEEVQLLVFGLT 


3797 


546 


114 


QKEEVQLLVFGLTAN 


3770 


517 


110 


EVEDQKEEVQLLVFGLTA 


3647 


547 


115 


KEEVQLLVFGLTANS 


3485 


518 


111 


VEDQKE EVQLLVFGLTAN 


3913 


548 


116 


EEVQLLVFGLTANSD 


6423 


519 


112 


EDQKEEVQLLVFGLTANS 


3416 


549 


117 


EVQLLVFGLTANSDT 


2689 


520 


113 


DQKEEVQLLVFGLTANSD 


3317 


550 










114 


QKE EVQLLVFGLTANSDT 


3671 


551 


130 


DTHLLQGQSLTLTLE 


1622 


521 


127 


ANS DTHLLQGQSLTLTLE 


1540 


552 


131 


THLLQGQSLTLTLES 


1874 


522 


128 


NSDTHLLQGQSLTLTLES 


1726 


553 


132 


HLLQGQSLTLTLESP 


1277 


523 


129 


SDTHLLQGQSLTLTLESP 


1260 


554 


213 


KKEGEQVEFSFPLAF 


1921 


524 


210 


I V YKKEGEQVE FS FPLAF 


5382 


555 


214 


KEGEQVEFSFPLAFT 


3253 


525 


211 


VYKKEGEQVEFSFPLAFT 


4307 


556 


215 


EGEQVEFSFPLAFTV 


3270 


526 


212 


Y KKEGE QVE F S F P LAFTV 


483 9 


557 


216 


GEQVEFSFPLAFTVE


4656 


527 


213 


KKEGEQVEFSFPLAFTVE


4683


558


217


EQVEFSFPLAFTVEK


4135


528


214


KEGEQVEFSFPLAFTVEK


3117


559


218 


QVEFSFPLAFTVEKL 


2047 


529 


215 


EGEQVE FS FPLAFTVE KL 


2164 


560 










216 


GEQVE FS F P LAFTVE KLT 


1643 


561 




ACTIVE 21-MERS 














90 


GNFPLIIKNLKIEDSDTYICE 


5248 


562 










91 


NFPLI IKNLKIEDSDTYICEV 


7803 


563 










92 


FPLII KNLKI EDSDTY I CEVE 


13919 


564 














93 


PLI I KNLKI EDSDTYI CEVED 


20145 


565 


94 


LI I KNLKI EDSDTYI CEVEDQ 


17108 


566 


95 


I I KNLKI EDSDTYI CEVEDQK 


11892 


567 


96 


I KNLKI EDSDTYI CEVEDQKE 


15073 


568 


97 


KNLKI EDSDTYI CEVEDQ KEE 


8789 


569 


99 


LKI EDSDTY I CEVEDQKE EVQ 


5519 


570 


100 


KI EDSDTYI CEVEDQKEEVQL 


6325 


571 


101 


I EDSDTYI CEVEDQKEEVQLL 


12064 


572 


102 


EDSDTYI CEVEDQ KEEVQLLV 


4933 


573 


103 


DSDTY I CEVEDQ KEEVQLLVF 


30277 


574 


104 


SDT Y I CEVEDQKEEVQLLVFG 


30319 


575 


105 


DTY I CEVEDQ KEE VQLLVFGL 


25424 


576 


106 


TYI CEVEDQKE EVQLLVFGLT 


20191 


577 


107 


Y I CEVEDQKE EVQLLVFGLTA 


22884 


578 


108 


I CEVEDQKEEVQLLVFGLTAN 


7276 


579 


109 


CEVEDQ KEEVQLLVFGLTANS 


3517 


580 


123 


FGLTANSDTHLLQGQSLTLTL 


11529 


581 


124 


GLTANSDTHLLQGQSLTLTLE 


14065 


582 


125 


LTANSDTHLLQGQSLTLTLES 


17113 


583 


126 


TANSDTHLLQGQSLTLTLESP 


23595 


584 


204 


FQKASSIVYKKEGEQVEFSFP 


9382 


585 


205 


QKASSIVYKKEGEQVEFSFPL 


24959 


586 


206 


KASSIVYKKEGEQVEFSFPLA 


30873 


587 


207 


AS S I VYKKEGEQVE FSF PLAF 


25146 


588 


208 


SS I VYKKEGEQVE FSFPLAFT 


28068 


589 


209 


SIVYKKEGEQVEFSFPLAFTV 


8165 


590 


210 


I VYKKEGEQVE FSF PLAF TVE 


15620 


591 


221 


FSFPLAFTVEKLTGSGELWWQ 


4163 


592 


222 


SFPLAFTVEKLTGSGELWWQA 


2284 


593 


223 


F PLAF TVE KLTGSGELWWQAE 


6276 


594 


224 


PLAFTVEKLTGSGELWWQAER 


2647 


595 


225 


LAFTVE KLTGSGELWWQAE RA 


3577 


596 



Example 10 

This example provides data which enable those skilled in the art to arrive at the conclusions indicated in Examples 9 and 12. In this example, the counts of radiolabeled gp120 retained by each peptide indicated in the left-hand column are given in the right-hand column. The first panel (Panel A) provides data for 21-mers of CD4.
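One simple way to read count tables of this kind is to compare each peptide's counts with the "Empty (control)" row of the same panel. The sketch below does this for three 15-mers and the control value taken from Panel B of this example. The 2.0-fold cutoff and the function names are hypothetical choices for illustration; the text does not state a numerical criterion for calling a peptide active.

```python
# Counts of radiolabeled gp120 retained, transcribed from Panel B of this example.
EMPTY_CONTROL = 575  # "Empty (control)" row of Panel B

counts = {
    "CEVEDQKEEVQLLVF": 1729,
    "EVEDQKEEVQLLVFG": 2805,
    "LLVFGLTANSDTHLL": 599,
}


def fold_over_control(count, control=EMPTY_CONTROL):
    """Ratio of a peptide's counts to the empty-well control."""
    return count / control


def active_peptides(counts, threshold=2.0, control=EMPTY_CONTROL):
    """Peptides whose counts reach threshold-fold over control (hypothetical cutoff)."""
    return [peptide for peptide, count in counts.items()
            if fold_over_control(count, control) >= threshold]


for peptide, count in counts.items():
    print(peptide, round(fold_over_control(count), 2))
print(active_peptides(counts))
```

A ratio is used here rather than a difference so that panels with slightly different control counts can be compared on the same scale.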



Panel A 
PEPTIDE 



LWDQGNFPL I I KNLKI EDSDT 

WDQGNFPLIIKNLKIEDSDTY 

DQGNFPL I I KNLKI EDSDTYI 

QGNFPLI IKNLKIEDSDTYIC 

GNFPLI I KNLKI EDSDTYI CE 

NFPLII KNLKI EDSDTYI CEV 

FPLIIKNLKIEDSDTYICEVE 

PLI IKNLKIEDSDTYICEVED 

L I I KNLKI EDSDTYI CEVEDQ 

I I KNLKI EDSDT Y I CE VEDQK 

I KNLKI EDSDTYI CEVEDQKE 

KNLKIEDSDTYICEVEDQKEE 

NLKIEDSDTYICEVEDQKEEV 

LKI EDSDTYI CEVEDQKEEVQ 

KI EDSDT YI CEVEDQKEEVQL 

I EDSDT Y I CEVEDQKEE VQLL 

EDSDTYI CEVEDQ KEEVQLLV 

DSDTYICEVEDQKEEVQLLVF 

SDTYICEVEDQKEEVQLLVFG 

DTYICEVEDQKEEVQLLVFGL 

T Y I CEVEDQKEE VQLLVFGLT 

YI CEVEDQKEEVQLLVFGLTA 

I CEVEDQKEEVQLLVFGLTAN 

CEVEDQKEEVQLLVFGLTANS 

EVEDQKEEVQLLVFGLTANSD 

VEDQKEEVQLLVFGLTANSDT 

EDQKEEVQLLVFGLTANSDTH 

DQKEEVQLLVFGLTANSDTHL 

QKEEVQLLVFGLTANSDTHLL 

KEEVQLLVFGLTANSDTHLLQ 

EEVQLLVFGLTANSDTHLLQG 

EVQLLVFGLTANSDTHLLQGQ 

VQLLVFGLTANSDTHLLQGQS 

QLLVFGLTANSDTHLLQGQSL 

LLVFGLTANSDTHLLQGQSLT 

LVFGLTANSDTHLLQGQSLTL 

VFGLTANSDTHLLQGQSLTLT 

FGLTANSDTHLLQGQSLTLTL 

GLTANSDTHLLQGQSLTLTLE 

LTANSDTHLLQGQSLTLTLES 

TANSDTHLLQGQSLTLTLESP 

Empty (Control) 

TWTCTVLQNQKKVEFKIDIW 

WTC T VLQNQ KKVE F K I D I WL 

TCTVLQNQKKVEFKIDIWLA 



COUNTS SEQ ID 
NO: 





S 97 


Q Q Q 
O O -7 


CQO 

J70 


I 1 *)Q 

II J O 


CQQ 
3 -7 27 


/L. *± ^£ 


0 u u 


J^l O 


£0 1 

O U X 


7 ft 0 ~k 
1 0 \j 0 


COO 


1 "501 Q 
X ~> ZP X -7 


O VJ .5 


9 0 1 a ^ 

^ U Xfi D 


D U ft 


1 7 1 O ft 


£ 0 r 

DUD 


1 1 ft Q9 

X X O -7 ^ 


£ 0 £ 

DUO 


1^0 7** 

1 JU / J 


D \J / 


ft 7 ft Q 


£ O ft 
DUO 


9 m £ 


£ A Q 
DU-7 


CC1 Q 
J D 1 -? 


D X VJ 


O O ^ 3 


1 

D X X 


19 0 £4. 

JL U D *x 


£1 9 

D1Z 


4. Q7 7 


O X -3 


O 9 7 7 
J> \J z. / 1 


CI / 
O X *± 


-J U J 1 I? 


O X 3 


9 R4 94. 


£ 1 £ 
OXO 


9 n 1 q 1 


£ 1 7 
D X / 


9 9 ft ft A 
^ *i O O r± 


£ 1 ft 
OXO 




£ 1 Q 


j ji / 


£ 9 n 

D VJ 


1 £ ft 7 

X O O / 


£ 9 1 


£4. £ 

O TC O 


£ 9 9 
O ^ 


R £ 9 


COT 
D 


J -7 -7 


£94 




DZ J 


G ft 9 


£9 


con 

O _7 


£ 9 7 


Rft 9 


£9 ft 
0 0 


_L VJ -7 -7 


COQ 

O ^ -7 


9 0 R7 

^ U J / 


0 -3 u 


obU 




4677 


632 


2762 


633 


11529 


634 


14065 


635 


17113 


636 


23595 


637 


515 




1430 


638 


1616 


639 


1092 


640 






L- 1 VJjQJNQKKVEr i\lUl V VliAr 


o q r\ Q 

zyuy 


d41 


I VljQJNJQKrvVrjr r\.lDl V VliA.r Q 


*a o ^7 *3 




VLQJNQKKVhjr K1D1 V V LAr QK 


T -3 O O 


o4 3 


T OrVTT"\ Vin TTP TT 1 V X T\ T T TT TT A DAV A 

LQJNIQKKVrjr KllJl V VLAr QKA 


T O C C 

12 DO 


C A A 

64 4 


/^VT/^T^T^^ TT^ TPT/* X "FN X T TT TT A "nAV A O 

QJNQKKVEr KlUlVVijAf QKAb 


T O A O 

1 0O0 


/T /I C 

64 b 


MAVVl TTPTPT/'X T^ TT TT 7T ADnVACC 

JNJQKKVEr K1D1 V VLAr yKAhb 


1507 


64 6 


QKKVEr K1D1 VVLAr ^KAbbl 


T r a 

759 


64 7 


KKVEr KID1 VVLAr QKAbblV 


*"7 O O 

7 82 


64 8 


T/"T TT-l TT1 Ty X 7"*V T T 7T TT 7V C X T 

KVEFKI D 1 VVLAFQKAS S 1 VY 


63 5 


64 9 


T rrirtT/ T T*\ X T TT TT "A TT»/"\T/"A P P TT T~\T1S 

VEFKIDI WLiAFQKAS SI VYK 


725 


6 50 


br KlDIVVLAr yKAbblVYKK 


64 9 


651 


T~i TV" ~T r> X T TT TT 7\ TT'/ r ^vTy 7\ C O T t rvvt/Ti 

FKI D I VVLAFQKAS S I VYKKE 


593 


6 52 


T/TTiTinrT 7\ DAT/A O C TTrVT/T/"CA 

K1U1 VVLAr yKAbblVYKKbCj 


13 94 


/- r a 

6 53 


ID1 VVLAr QKAbbI VYKKEGE 


c\ tr 0 

962 


654 


T^v X T TT TT A DAT/7k P C T^rVT/T/CA'PA 

UlVVLAr QKAbbI VYKKbGbQ 


1-700 
78 8 


/— r— r~ 

655 


XTTTTT A DAT/A CCTT7VT/"I/'DAT?ATT 

1 V VLAr Q KAb blVY KKbGbQ V 


64 6 


656 


TTTTT A T?AL r A C C TTr\TVVT?r'rj'ATTT? 

VVLAr yKAbbl VYKKE GbQVb 


772 


657 


V J_i/\r QJ\/\bbl V Y rvrvbLrbQ vbr 


1 ion 

1/93 


r r 0 

6bo 


LAr yKAbbl VYKKEGEQVEFS 


1 /l T A 

14 1 0 


/- r a 

659 


A UA VA O O T T TV V" T/* TJ 1 TP /^\T 717" Tp O TP 

Ah y J\Ab b 1 V I KKbbhy V hi? br 


3 / /b 


r r A 

660 


"Z?rW7\. a C T T rVTyTyTP/"" 1 TP/^T TT7 1 TP O TP T) 

r yisAbbl VYKKE bbyVhr brF 


93 82 


6 61 


Ay 7\ o O X T 7~\7Z/'TS'T?/~ i TJ»^T TTP TP O TP T">T 


24 959 


662 


T/A OOTTTVlTT/DADATTDDCDnT A 

KAbbl VYKKE LrEQVEr br PLiA 


3 0873 


6 63 


A C C T^rvVT/DADATTDPC'D'nT TV. "O 

Abb ± V YKKbGbQVEr br PLAr 


2 514 6 


6 64 


tP fP X T 7T.7" TV TV TT* /~1 TT» /"NT TTTI T~i O t-i T"\T "A T~i r-rn 

SSI VY KKEGEQVEF S FPLiAFT 


28068 


665 


SIVYKKEGEQVEFSFPLAFTV 


8165 


666 


T T TVT/Vn /"I T"t/"\T TT~~l T~t O I - 1 T~ST TV T" imT TT~l 

I VY KKEGEQVEF S F PLAFTVE 


15620 


667 


VY KKE GE Q VE F S F P I_iA F T V E K 


242 9 


668 


T7"IVTVrT«i^"'»T7' r\\ TT" »TT »0 TT»T"VT TV 1 — I rT"TT TT~ITVT" 

Y KKE GEQVEFSFP LAF T VE KL 


735 


669 


KKEGEQVE F S F PLAFTVE KLT 


1847 


670 


KEGEQVEFSFPLAFTVEKLTG 


972 


671 


bljbQVbr br PLAr 1 VbKLIGb 


73 9 


672 


f~* TP f'VT TTP TP O TP T~> T A TP T'T TTP T/" T r P/^«C/ r ^ 

LrbQVbr br FliAr 1 VbrvLlbbb 


if r 0 

6 52 


6 73 


TP (~\\ TTP TP OT7DT A TP'X'T TTP T/"T Xi/^O/^TP 

by Vbr br PLAr 1 VEKL lGbGE 


>-7 zt rr 

76 5 


>T "-7 /1 

6 74 


r\\ TTTI TP O TP T) T A L ir PT T T?T/T rp/" 1 C? TP T 

QVEr br PLAr 1 VEKLlGbGEL 


/4 1 


n r 

675 


T ttt 1 TP O TP T) T A TPT'TTTPVT TP O ^TPT TaT 


6 3 3 


r n /- 

6/6 


UDCCDT TV TT lr T , T TTPT/'T TnPPDT T.TT.T 

br br PLAr 1 VEKLlGbGELWW 


^T 0 1 

681 


^* •— 7 >— 7 
677 


TP C TP T~>T A TPT^T TTP XfT HPA npDT T»7T»7A 

rbrPLAr 1 VEKLlGbGELWWQ 


4163 


678 


SFPLAFTVEKLTGSGELWWQA 


2284 


679 


FPLAFTVEKLTGSGELWWQAE 


6276 


680 


PLAFTVEKLTGSGELWWQAER 


2647 


681 


LAFTVEKLTGSGELWWQAERA 


3577 


682 


AFTVEKLTGSGELWWQAERAS 


1739 


683 


Empty (control) 


617 








The second and third panels (Panels B and C) provide data for 15-mers and 18-mers of a small region of CD4.
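The peptides in these panels are overlapping windows that advance one residue at a time along the CD4 sequence. A minimal sketch of how such a scanning set can be enumerated is given below. The function name is hypothetical, and the 21-residue fragment with its starting position of 105 is taken from the active 21-mer listed at that position in Example 9.

```python
def scanning_windows(sequence, length, first_position=1):
    """Yield (starting position, peptide) for every window of the given length.

    first_position is the residue number of the first letter of sequence.
    """
    for offset in range(len(sequence) - length + 1):
        yield first_position + offset, sequence[offset:offset + length]


# CD4 fragment beginning at position 105 (the 21-mer listed at 105 in Example 9).
fragment = "DTYICEVEDQKEEVQLLVFGL"

nine_mers = dict(scanning_windows(fragment, 9, first_position=105))
for position, peptide in nine_mers.items():
    print(position, peptide)
```

The same call with `length` set to 12, 15, or 18 produces the longer windows that share each starting position.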

Panel B 



PEPTIDE 


COUNTS 


SEQ ID NO: 


LWDQGNFPLIIKNLK


502


684


WDQGNFPLIIKNLKI


534 


685 


DQGNFPLI IKNLKIE 


635 


686 


QGNFPLI I KNLKI ED 


509 


687 


GNFPLIIKNLKIEDS 


624 


688 


NFPL I I KNLKI EDSD 


654 


689 


FPLI IKNLKIEDSDT 


539 


690 


PL I I KNLKI EDSDTY 


661 


691 


LI I KNLKI EDSDTYI 


542 


692 


I IKNLKIEDSDTYIC 


664 


693 


I KNLKI EDSDTY I CE 


568 


694 


KNLKI EDSDTYI CEV 


562 


695 


NLKIEDSDTYICEVE 


1160 


696 


LKI EDSDTYI CEVED 


846 


697 


KIEDSDTYICEVEDQ 


1088 


698 


I EDSDTYI CEVEDQK 


1143 


699 


EDSDTY I CEVEDQKE 


815 


700 


DSDTYICEVEDQKEE 


973 


701 


SDTYICEVEDQKEEV 


993 


702 


DTYICEVEDQKEEVQ 


1071 


703 


TYICEVEDQKEEVQL 


956 


704 


YICEVEDQKEEVQLL 


1064 


705 


I CEVEDQKEEVQLLV 


1084 


706 


CEVEDQKEEVQLLVF 


1729 


707 


EVEDQKEEVQLLVFG 


2805 


708 


VEDQKEEVQLLVFGL 


3816 


709 


EDQKEEVQLLVFGLT 


3633 


710 


DQKEEVQLLVFGLTA 


3905 


711 


QKEEVQLLVFGLTAN 


3770 


712 


KEEVQLLVFGLTANS 


3485 


713 


EEVQLLVFGLTANSD 


6423 


714 


EVQLLVFGLTANSDT 


2689 


715 


VQLLVFGLTANSDTH 


1006 


716 


QLLVFGLTANSDTHL 


865 


717 


LLVFGLTANSDTHLL 


599 


718 


LVFGLTANSDTHLLQ 


609 


719 


VFGLTANSDTHLLQG 


532 


720 


FGLTANSDTHLLQGQ 


625 


721 


GLTANSDTHLLQGQS 


532 


722 


LTANSDTHLLQGQSL 


634 


723 


TANSDTHLLQGQSLT 


513 


724 


ANSDTHLLQGQSLTL 


542 


725 









NSDTHLLQGQSLTLT 


631 


726 


SDTHLLQGQSLTLTL 


747 


727 


DTHLLQGQSLTLTLE 


1622 


728 


THLLQGQSLTLTLES 


1874 


729 


HLLQGQSLTLTLESP 


1277 


730 


LWDQGNFPLIIKNLKIED


582 


731 


WDQGNFPLI IKNLKIEDS 


626 


732 


DQGNFPLI IKNLKIEDSD 


598 


733 


QGNFPLI IKNLKIEDSDT 


564 


734 


GNFPLI I KNLKI EDSDTY 


557 


735 


NFPLI IKNLKIEDSDTYI 


627 


736 


FPLIIKNLKIEDSDTYIC 


509 


737 


PLI IKNLKIEDSDTYICE 


624 


738 


LI IKNLKIEDSDTYICEV 


634 


739 


I I KNLKI EDSDTY I CEVE 


751 


740 


I KNLKI EDSDTY I CE VED 


699 


741 


KNLKI EDSDTY I CEVEDQ 


708 


742 


NLKIEDSDTYICEVEDQK 


863 


743 


LKI EDSDTYI CEVEDQKE 


872 


744 


KIEDSDTYICEVEDQKEE 


858 


745 


I EDSDTYI CEVEDQKEEV 


1230 


746 


EDSDTYI CEVEDQKEEVQ 


788 


747 


DSDTY I CEVEDQ KEEVQL 


961 


748 


SDTYICEVEDQKEEVQLL 


870 


749 


DTYICEVEDQKEEVQLLV 


1648 


750 


TYI CEVEDQKEEVQLLVF 


3794 


751 


Y I CEVEDQKE EVQLLVFG 


4611 


752 


I CEVEDQKEEVQLLVFGL 


3898 


753 


CEVEDQKEEVQLLVFGLT 


3797 


754 


EVEDQKEEVQLLVFGLTA 


3647 


755 


VEDQKEEVQLLVFGLTAN 


3913 


756 


EDQKEEVQLLVFGLTANS 


3416 


757 


DQKEEVQLLVFGLTANSD 


3317 


758 


QKEEVQLLVFGLTANSDT 


3671 


759 


KEEVQLLVFGLTANSDTH 


1271 


760 


EEVQLLVFGLTANSDTHL 


783 


761 


EVQLLVFGLTANSDTHLL 


667 


762 


VQLLVFGLTANSDTHLLQ 


673 


763 


QLLVFGLTANSDTHLLQG 


574 


764 


LLVFGLTANSDTHLLQGQ 


568 


765 


LVFGLTANSDTHLLQGQS 


564 


766 


VFGLTANSDTHLLQGQSL 


531 


767 


FGLTANSDTHLLQGQSLT 


591 


768 


GLTANSDTHLLQGQSLTL 


572 


769 


LTANSDTHLLQGQSLTLT 


528 


770 


TANSDTHLLQGQSLTLTL 


891 


771 


ANSDTHLLQGQSLTLTLE 


1540 


772 


NSDTHLLQGQSLTLTLES 


1726 


773 









SDTHLLQGQSLTLTLESP 


1260 


774


Empty (control) 


575 




Panel C 






PEPTIDE 


COUNTS SEQ 


ID NO: 


WTCTVLQNQKKVEFK 


566 


775 


TCTVLQNQKKVEFKI 


510 


776 


CTVLQNQKKVEFKID 


608 


777


TVLQNQKKVEFKIDI


587


778


VLQNQKKVEFKIDIV


605


779


LQNQKKVEFKIDIVV


644


780


QNQKKVEFKIDIVVL


636


781


NQKKVEFKIDIVVLA


860


782


QKKVEFKIDIVVLAF


1333


783


KKVEFKIDIVVLAFQ


951


784


KVEFKIDIVVLAFQK


1051


785


VEFKIDIVVLAFQKA


1005


786


EFKIDIVVLAFQKAS


1188


787


FKIDIVVLAFQKASS


1001


788


KIDIVVLAFQKASSI


956


789


IDIVVLAFQKASSIV


865


790


DIVVLAFQKASSIVY


776


791


IVVLAFQKASSIVYK


783


792


VVLAFQKASSIVYKK


577 


793 


VLAFQKASSIVYKKE 


634 


794 


LAFQKASSIVYKKEG 


593 


795 


AFQKASSIVYKKEGE 


544 


796 


FQKASSIVYKKEGEQ 


637 


797 


QKASSIVYKKEGEQV 


519 


798 


KASSIVYKKEGEQVE 


563 


799 


ASSIVYKKEGEQVEF 


589 


800 


S S I VYKKEGEQ VE F S 


558 


801


S I VYKKEGEQVEFSF 


651 


802 


IVYKKEGEQVEFSFP 


615 


803 


VYKKEGEQVEFSFPL 


714 


804 


YKKEGEQVEFSFPLA 


687 


805 


KKEGEQVE FS F PLAF 


1921 


806 


KEGEQVEFSFPLAFT 


3253 


807 


EGEQVEFSFPLAFTV 


3270 


808 


GEQVEFSFPLAFTVE 


4656 


809 


EQVEFSFPLAFTVEK 


4135 


810 


QVEFSFPLAFTVEKL 


2047 


811 


VE F S F P LA F T VE KL T 


899 


812 


EFSFPLAFTVEKLTG 


920 


813 


FSFPLAFTVEKLTGS 


672 


814 


SFPLAFTVEKLTGSG 


565 


815 









FPLAFTVEKLTGSGE 


556 


816 


PLAFTVEKLTGSGEL 


612 


817 


LAFTVEKLTGSGELW 


579 


818 


AFTVEKLTGSGELWW 


586 


819 


FTVEKLTGSGELWWQ 


625 


820 


TVEKLTGSGELWWQA 


550 


821 


VEKLTGSGELWWQAE 


735 


822 


EKLTGSGELWWQAER 


683 


823 


WTCTVLQNQKKVEFKIDI


588


824


TCTVLQNQKKVEFKIDIV


571


825


CTVLQNQKKVEFKIDIVV


553


826


TVLQNQKKVEFKIDIVVL


655


827


VLQNQKKVEFKIDIVVLA


724


828


LQNQKKVEFKIDIVVLAF


938


829


QNQKKVEFKIDIVVLAFQ


917


830


NQKKVEFKIDIVVLAFQK


889


831


QKKVEFKIDIVVLAFQKA


1013


832


KKVEFKIDIVVLAFQKAS


912


833


KVEFKIDIVVLAFQKASS


1011


834


VEFKIDIVVLAFQKASSI


819


835


EFKIDIVVLAFQKASSIV


799


836


FKIDIVVLAFQKASSIVY


843


837


KIDIVVLAFQKASSIVYK


779


838


IDIVVLAFQKASSIVYKK


711


839


DIVVLAFQKASSIVYKKE


660


840


IVVLAFQKASSIVYKKEG


531


841


VVLAFQKASSIVYKKEGE


560


842


VLAFQKASSIVYKKEGEQ


549 


843 


LAFQKASSIVYKKEGEQV 


665 


844 


AFQKASSIVYKKEGEQVE 


514 


845 


FQKAS S I VYKKEGEQVE F 


528 


846 


QKAS S I VYKKEGEQVEFS 


602 


847 


KASSIVYKKEGEQVEFSF 


536 


848 


ASS I VYKKEGEQVEFS FP 


701 


849 


SSIVYKKEGEQVEFSFPL 


756 


850 


SIVYKKEGEQVEFSFPLA 


771 


851 


IVYKKEGEQVEFSFPLAF 


5382 


852 


VYKKEGEQVEFSFPLAFT 


4307 


853 


YKKEGEQVEFSFPLAFTV 


4839 


854 


KKEGEQVE F S F P LAF T VE 


4683 


855 


KEGEQVEFSFPLAFTVEK 


3117 


856 


EGEQVEFSFPLAFTVEKL 


2164 


857 


GEQVEFSFPLAFTVEKLT 


1643 


858 


EQVEFSFPLAFTVEKLTG 


798 


859 


QVEFSFPLAFTVEKLTGS 


736 


860 


VEFSFPLAFTVEKLTGSG 


533 


861 


EFSFPLAFTVEKLTGSGE 


668 


862 


FSFPLAFTVEKLTGSGEL 


613 


863 






SFPLAFTVEKLTGSGELW


t~ r~ r~ 

ODD 


864 


FPLAFTVEKLTGSGELWW 


586 


865 


PLAFTVEKLTGSGELWWQ 


650 


866 


LAFTVEKLTGSGELWWQA 


866 


867 


AFTVEKLTGSGELWWQAE 


788 


868 


FTVEKLTGSGELWWQAER 


1143 


869 


Empty (control) 


556 





The fourth and fifth panels (Panels D and E) provide data for select 9-mers and 12-mers of CD4.

Panel D



PEPTIDE


COUNTS


SEQ ID NO:


DQGNFPLII 


662 


870 


QGNFPLIIK 


508 


871 


GNFPLIIKN 


600 


872 


NFPLIIKNL 


561 


873 


FPLIIKNLK 


601 


874 


PLIIKNLKI 


697 


875 


LIIKNLKIE 


515 


876 


I IKNLKIED 


658 


877 


IKNLKIEDS 


557 


878 


KNLKIEDSD 


612 


879 


NLKIEDSDT 


512 


880 


LKIEDSDTY 


492 


881 


KIEDSDTYI 


603 


882 


IEDSDTYIC 


567 


883 



EDSDTYICE 


650 


884 


DSDTYICEV 


712 


885 


SDTYICEVE 


819 


886 


DTYICEVED 


1043 


887 


TYICEVEDQ 


805 


888 


YICEVEDQK 


728 


889 


ICEVEDQKE 


596 


890 


CEVEDQKEE 


555 


891 


EVEDQKEEV 


587 


892 


VEDQKEEVQ 


521 


893 


EDQKEEVQL 


564 


894 


DQKEEVQLL 


589 


895 


QKEEVQLLV 


636 


896 


KEEVQLLVF 


1273 


897 


EEVQLLVFG 


3170 


898 


EVQLLVFGL 


2146 


899 


VQLLVFGLT 


815 


900 


QLLVFGLTA 


822 


901 


LLVFGLTAN 


576 


902 






LVFGLTANS 

VFGLTANSD 

FGLTANSDT 

GLTANSDTH 

LTANSDTHL 

TANSDTHLL 

ANSDTHLLQ 

NSDTHLLQG 

SDTHLLQGQ 

DTHLLQGQS 

THLLQGQSL 

HLLQGQSLT 

LLQGQSLTL 

LQGQSLTLT 

QGQSLTLTL 

GQSLTLTLE 

DQGNFPLI I KNL 

QGNFPLIIKNLK 

GNFPLI IKNLKI 

NFPLI IKNLKIE 

FPLI IKNLKIED 

PLI IKNLKI EDS 

LIIKNLKIEDSD 

IIKNLKIEDSDT 

I KNLKIEDSDTY 

KNLKIEDSDTYI 

NLKI EDSDTYI C 

LKIEDSDTYICE 

KI EDSDTYI CEV 

I EDSDTYI CEVE 

EDSDTYI CEVED 

DSDT Y I CEVEDQ 

SDTYI CEVEDQK 

DTYI CEVEDQKE 

TYICEVEDQKEE 

YICEVEDQKEEV 

I CEVEDQKEEVQ 

CEVEDQKEEVQL 

EVEDQKEEVQLL 

VEDQKEEVQLLV 

EDQKEEVQLLVF 

DQKEEVQLLVFG 

QKEEVQLLVFGL 

KEEVQLLVFGLT 

EEVQLLVFGLTA 

EVQLLVFGLTAN 

VQLLVFGLTANS 

QLLVFGLTANSD 



r n ^ 

52 2 


903 


54 9 


904 


rr ~i 

5 63 


905 


A CS 1 

4 81 


906 


r~ r\ s- 

596 


907 


554 


908 


642 


r\ r\ c\ 

909 


561 


910 


526 


911 


578 


912 


512 


913 


564 


914 


568 


915 


501 


916 


594 


917 


111 


918 


604 


919 


533 


920 


547 


921 


647 


C\ o o 

922 


511 


923 


565 


924 


619 


925 


511 


926 


574 


927 


523 


928 


639 


929 


635 


930 


601 


931 


1107 


932 


956 


933 


93 7 


934 


84 6 


935 


72 0 


936 


818 


937 


734 


f\ *"i o 

93 8 


585 


93 9 


561 


940 


508 


941 


657 


942 


1379 


943 


1624 


944 


1785 


945 


1774 


946 


3261 


947 


1838 


948 


747 


949 


721 


950 



62 



LLVFGLTANSDT  533  951
LVFGLTANSDTH  586  952
VFGLTANSDTHL  548  953
FGLTANSDTHLL  571  954
GLTANSDTHLLQ  574  955
LTANSDTHLLQG  534  956
TANSDTHLLQGQ  549  957
ANSDTHLLQGQS  559  958
NSDTHLLQGQSL  585  959
SDTHLLQGQSLT  540  960
DTHLLQGQSLTL  527  961
THLLQGQSLTLT  646  962
HLLQGQSLTLTL  701  963
LLQGQSLTLTLE  1320  964
Empty (control)  581
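
The peptides tabulated above are consecutive overlapping windows of the parent sequence, each shifted by one residue from the previous one. As an illustration only, the following minimal Python sketch regenerates such a series. The function name `sliding_windows` and the variable `region` are hypothetical, and the sketch assumes that the 12-residue peptides of SEQ ID NOS: 951-964 were derived by simple one-residue shifts, as their sequences suggest.

```python
def sliding_windows(sequence: str, width: int) -> list[str]:
    """Return every contiguous window of `width` residues, shifted by one."""
    if width <= 0:
        raise ValueError("width must be positive")
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


# Stretch of sequence spelled out by the overlapping 12-mers in the table
# (assumed here to be the parent region for SEQ ID NOS: 951-964).
region = "LLVFGLTANSDTHLLQGQSLTLTLE"

for peptide in sliding_windows(region, 12):
    print(peptide)
```

Calling the same function with a width of 9 would produce the corresponding 9-residue series for any chosen region.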



Panel E

PEPTIDE  COUNTS  SEQ ID NO:
TVLQNQKKV  534  965
VLQNQKKVE  556  966
LQNQKKVEF  565  967
QNQKKVEFK  537  968
NQKKVEFKI  597  969
QKKVEFKID  575  970
KKVEFKIDI  501  971
KVEFKIDIV  555  972
VEFKIDIVV  548  973
EFKIDIVVL  665  974
FKIDIVVLA  568  975
KIDIVVLAF  665  976
IDIVVLAFQ  691  977
DIVVLAFQK  686  978
IVVLAFQKA  602  979
VVLAFQKAS  600  980
VLAFQKASS  466  981
LAFQKASSI  592  982
AFQKASSIV  595  983
FQKASSIVY  568  984
QKASSIVYK  494  985
KASSIVYKK  498  986
ASSIVYKKE  600  987
SSIVYKKEG  515  988
SIVYKKEGE  566  989
IVYKKEGEQ  534  990
VYKKEGEQV  490  991
YKKEGEQVE  518  992
KKEGEQVEF  [illegible]  993
KEGEQVEFS  [illegible]  994
EGEQVEFSF  [illegible]  995
GEQVEFSFP  [illegible]  996
EQVEFSFPL  [illegible]  997
QVEFSFPLA  [illegible]  998
VEFSFPLAF  [illegible]  999
EFSFPLAFT  [illegible]  1000
FSFPLAFTV  472  1001
SFPLAFTVE  619  1002
FPLAFTVEK  569  1003
PLAFTVEKL  597  1004
LAFTVEKLT  501  1005
AFTVEKLTG  517  1006
FTVEKLTGS  574  1007
TVEKLTGSG  [illegible]  1008
VEKLTGSGE  [illegible]  1009
EKLTGSGEL  541  1010
KLTGSGELW  491  1011
LTGSGELWW  [illegible]  1012
TGSGELWWQ  [illegible]  1013
TVLQNQKKVEFK  [illegible]  1014
VLQNQKKVEFKI  [illegible]  1015
LQNQKKVEFKID  [illegible]  1016
QNQKKVEFKIDI  559  1017
NQKKVEFKIDIV  532  1018
QKKVEFKIDIVV  595  1019
KKVEFKIDIVVL  597  1020
KVEFKIDIVVLA  560  1021
VEFKIDIVVLAF  681  1022
EFKIDIVVLAFQ  659  1023
FKIDIVVLAFQK  736  1024
KIDIVVLAFQKA  [illegible]  1025
IDIVVLAFQKAS  [illegible]  1026
DIVVLAFQKASS  [illegible]  1027
IVVLAFQKASSI  [illegible]  1028
VVLAFQKASSIV  [illegible]  1029
VLAFQKASSIVY  [illegible]  1030
LAFQKASSIVYK  [illegible]  1031
AFQKASSIVYKK  597  1032
FQKASSIVYKKE  577  1033
QKASSIVYKKEG  596  1034
KASSIVYKKEGE  559  1035
ASSIVYKKEGEQ  523  1036
SSIVYKKEGEQV  615  1037
SIVYKKEGEQVE  543  1038
IVYKKEGEQVEF  533  1039
VYKKEGEQVEFS  584  1040

1040 
YKKEGEQVEFSF  548  1041
KKEGEQVEFSFP  598  1042
KEGEQVEFSFPL  710  1043
EGEQVEFSFPLA  1456  1044
GEQVEFSFPLAF  1729  1045
EQVEFSFPLAFT  1556  1046
QVEFSFPLAFTV  1636  1047
VEFSFPLAFTVE  518  1048
EFSFPLAFTVEK  585  1049
FSFPLAFTVEKL  573  1050
SFPLAFTVEKLT  528  1051
FPLAFTVEKLTG  622  1052
PLAFTVEKLTGS  528  1053
LAFTVEKLTGSG  608  1054
AFTVEKLTGSGE  511  1055
FTVEKLTGSGEL  530  1056
TVEKLTGSGELW  573  1057
VEKLTGSGELWW  477  1058
EKLTGSGELWWQ  543  1059
Empty (control)  571



Panels F and G provide data on sequential alanine 
replacements for selected CD4 polypeptides. 
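
As an illustration of how a sequential alanine replacement series can be enumerated, the following minimal Python sketch substitutes one position at a time. The function name `alanine_scan` and its parameters are hypothetical. Two behaviours are assumptions read off the peptide tables rather than stated rules: a native alanine is replaced by threonine (as in EEVQLLVFGLTTNSD), and positions occupied by the placeholder Z are left untouched.

```python
def alanine_scan(peptide: str, native_ala_sub: str = "T",
                 placeholder: str = "Z") -> list[str]:
    """Return one variant per scannable position, in N- to C-terminal order.

    Each residue is replaced by alanine ("A"); a residue that is already
    alanine is replaced by `native_ala_sub` (assumed to be threonine).
    Placeholder positions are skipped, so they yield no variant.
    """
    variants = []
    for i, residue in enumerate(peptide):
        if residue == placeholder:
            continue  # nothing to substitute at a placeholder position
        replacement = native_ala_sub if residue == "A" else "A"
        variants.append(peptide[:i] + replacement + peptide[i + 1:])
    return variants


for variant in alanine_scan("EEVQLLVFGLTANSD"):
    print(variant)
```

For a padded peptide such as ZZZZZZDTYICEVED, the sketch yields nine variants, one for each residue of DTYICEVED.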

Panel F

PEPTIDE  COUNTS  SEQ ID NO:
ZZZZZZDTYICEVED  5844  1060
ZZZZZZATYICEVED  5921  1061
ZZZZZZDAYICEVED  6362  1062
ZZZZZZDTAICEVED  1301  1063
ZZZZZZDTYACEVED  2583  1064
ZZZZZZDTYIAEVED  4483  1065
ZZZZZZDTYICAVED  3154  1066
ZZZZZZDTYICEAED  3432  1067
ZZZZZZDTYICEVAD  3595  1068
ZZZZZZDTYICEVEA  5942  1069
ZZZZZZDTYICEVED  4973  1070
ZZZZZZDTYICEVED  4775  1070
ZZZZZZATYICEVED  4962  1071
ZZZZZZDAYICEVED  4163  1072
ZZZZZZDTAICEVED  1384  1073
ZZZZZZDTYACEVED  3085  1074
ZZZZZZDTYIAEVED  5128  1075
ZZZZZZDTYICAVED  2587  1076
ZZZZZZDTYICEAED  2499  1077
ZZZZZZDTYICEVAD  [illegible]  1078
ZZZZZZDTYICEVEA  [illegible]  1079
ZZZZZZDTYICEVED  [illegible]  1080
EEVQLLVFGLTANSD  [illegible]  1081
AEVQLLVFGLTANSD  16220  1082
EAVQLLVFGLTANSD  14220  1083
EEAQLLVFGLTANSD  [illegible]  1084
EEVALLVFGLTANSD  10890  1085
EEVQALVFGLTANSD  11258  1086
EEVQLAVFGLTANSD  11954  1087
EEVQLLAFGLTANSD  [illegible]  1088
EEVQLLVAGLTANSD  9573  1089
EEVQLLVFALTANSD  19348  1090
EEVQLLVFGATANSD  10408  1091
EEVQLLVFGLAANSD  19973  1092
EEVQLLVFGLTTNSD  20100  1093
EEVQLLVFGLTAASD  19390  1094
EEVQLLVFGLTANAD  17684  1095
EEVQLLVFGLTANSA  18227  1096
EEVQLLVFGLTANSD  19738  1097
EEVQLLVFGLTANSD  21338  1098
AEVQLLVFGLTANSD  14590  1099
EAVQLLVFGLTANSD  13213  1100
EEAQLLVFGLTANSD  16296  1101
EEVALLVFGLTANSD  13415  1102
EEVQALVFGLTANSD  12603  1103
EEVQLAVFGLTANSD  13690  1104
EEVQLLAFGLTANSD  16286  1105
EEVQLLVAGLTANSD  11480  1106
EEVQLLVFALTANSD  18254  1107
EEVQLLVFGATANSD  19978  1108
EEVQLLVFGLAANSD  18863  1109
EEVQLLVFGLTTNSD  [illegible]  1110
EEVQLLVFGLTAASD  19200  1111
EEVQLLVFGLTANAD  [illegible]  1112
EEVQLLVFGLTANSA  [illegible]  1113
EEVQLLVFGLTANSD  18721  1114
THLLQGQSLTLTLES  7756  1115
AHLLQGQSLTLTLES  8602  1116
TALLQGQSLTLTLES  6931  1117
THALQGQSLTLTLES  7683  1118
THLAQGQSLTLTLES  7701  1119
THLLAGQSLTLTLES  4578  1120
THLLQAQSLTLTLES  8471  1121
THLLQGASLTLTLES  4238  1122
THLLQGQALTLTLES  8659  1123
THLLQGQSATLTLES  4430  1124
THLLQGQSLALTLES  8158  1125


1125 
THLLQGQSLTATLES  [illegible]  1126
THLLQGQSLTLALES  [illegible]  1127
THLLQGQSLTLTAES  [illegible]  1128
THLLQGQSLTLTLAS  [illegible]  1129
THLLQGQSLTLTLEA  [illegible]  1130
THLLQGQSLTLTLES  [illegible]  1131
THLLQGQSLTLTLES  4707  1132
AHLLQGQSLTLTLES  [illegible]  1133
TALLQGQSLTLTLES  [illegible]  1134
THALQGQSLTLTLES  [illegible]  1135
THLAQGQSLTLTLES  [illegible]  1136
THLLAGQSLTLTLES  [illegible]  1137
THLLQAQSLTLTLES  [illegible]  1138
THLLQGASLTLTLES  [illegible]  1139
THLLQGQALTLTLES  [illegible]  1140
THLLQGQSATLTLES  [illegible]  1141
THLLQGQSLALTLES  [illegible]  1142
THLLQGQSLTATLES  813  1143
THLLQGQSLTLALES  8147  1144
THLLQGQSLTLTAES  797  1145
THLLQGQSLTLTLAS  2193  1146
THLLQGQSLTLTLEA  7984  1147
THLLQGQSLTLTLES  5947  1148


Empty (control) 
Panel G

PEPTIDE  COUNTS  SEQ ID NO:
GEQVEFSFPLAFTVE  20691  1149
AEQVEFSFPLAFTVE  18546  1150
GAQVEFSFPLAFTVE  17733  1151
GEAVEFSFPLAFTVE  17500  1152
GEQAEFSFPLAFTVE  14764  1153
GEQVAFSFPLAFTVE  16668  1154
GEQVEASFPLAFTVE  6793  1155
GEQVEFAFPLAFTVE  21681  1156
GEQVEFSAPLAFTVE  7767  1157
GEQVEFSFALAFTVE  20480  1158
GEQVEFSFPAAFTVE  10024  1159
GEQVEFSFPLTFTVE  17397  1160
GEQVEFSFPLAATVE  10130  1161
GEQVEFSFPLAFAVE  20627  1162
GEQVEFSFPLAFTAE  18797  1163
GEQVEFSFPLAFTVA  18371  1164
GEQVEFSFPLAFTVE  17662  1165
GEQVEFSFPLAFTVE  19190  1166
AEQVEFSFPLAFTVE  18042  1167



GAQVEFSFPLAFTVE  18079  1168
GEAVEFSFPLAFTVE  19756  1169
GEQAEFSFPLAFTVE  13000  1170
GEQVAFSFPLAFTVE  13930  1171
GEQVEASFPLAFTVE  6533  1172
GEQVEFAFPLAFTVE  20072  1173
GEQVEFSAPLAFTVE  7378  1174
GEQVEFSFALAFTVE  19480  1175
GEQVEFSFPAAFTVE  10589  1176
GEQVEFSFPLTFTVE  18318  1177
GEQVEFSFPLAATVE  9572  1178
GEQVEFSFPLAFAVE  19516  1179
GEQVEFSFPLAFTAE  16765  1180
GEQVEFSFPLAFTVA  18187  1181
GEQVEFSFPLAFTVE  18219  1182
ZZZZZZDTYICEVED  5017  1183
ZZZZZZDTYICEVEZ  5421  1184
ZZZZZZDTYICEVZZ  2166  1185
ZZZZZZDTYICEZZZ  922  1186
ZZZZZZDTYIZZZZZ  564  1187
ZZZZZZZTYICEVED  3031  1188
EEVQLLVFGLTANSD  23357  1189
EEVQLLVFGLTANSZ  15808  1190
EEVQLLVFGLTANZZ  16496  1191
EEVQLLVFGLTAZZZ  14097  1192
EEVQLLVFGLTZZZZ  16473  1193
EEVQLLVFGLZZZZZ  10516  1194
EEVQLLVFGZZZZZZ  10372  1195
EEVQLLVFZZZZZZZ  7333  1196
EEVQLLVZZZZZZZZ  1098  1197
ZEVQLLVFGLTANSD  16716  1198
ZZVQLLVFGLTANSD  5281  1199
ZZZQLLVFGLTANSD  4310  1200
ZZZZLLVFGLTANSD  1026  1201
ZZZZZLVFGLTANSD  664  1202
ZZZZZZVFGLTANSD  779  1203
ZZZZZZZFGLTANSD  760  1204
ZZZZZZZZGLTANSD  657  1205
EEVQLLVFGLTANSD  18040  1206
THLLQGQSLTLTLES  10850  1207
THLLQGQSLTLTLEZ  10269  1208
THLLQGQSLTLTLZZ  4668  1209
THLLQGQSLTLTZZZ  908  1210
THLLQGQSLTLZZZZ  844  1211
THLLQGQSLTZZZZZ  475  1212
THLLQGQSLZZZZZZ  548  1213
THLLQGQSZZZZZZZ  570  1214
THLLQGQZZZZZZZZ  442  1215
ZHLLQGQSLTLTLES  11445  1216
ZZLLQGQSLTLTLES  11631  1217
ZZZLQGQSLTLTLES  7993  1218
ZZZZQGQSLTLTLES  6887  1219
ZZZZZGQSLTLTLES  3305  1220
ZZZZZZQSLTLTLES  4453  1221
ZZZZZZZSLTLTLES  1086  1222
ZZZZZZZZLTLTLES  1201  1223
THLLQGQSLTLTLES  9756  1224
GEQVEFSFPLAFTVE  18856  1225
GEQVEFSFPLAFTVZ  16222  1226
GEQVEFSFPLAFTZZ  12535  1227
GEQVEFSFPLAFZZZ  11384  1228
GEQVEFSFPLAZZZZ  5846  1229
GEQVEFSFPLZZZZZ  4749  1230
GEQVEFSFPZZZZZZ  2208  1231
GEQVEFSFZZZZZZZ  3277  1232
GEQVEFSZZZZZZZZ  742  1233
ZEQVEFSFPLAFTVE  19736  1234
ZZQVEFSFPLAFTVE  18684  1235
ZZZVEFSFPLAFTVE  12892  1236
ZZZZEFSFPLAFTVE  12166  1237
ZZZZZFSFPLAFTVE  2134  1238
ZZZZZZSFPLAFTVE  1454  1239
ZZZZZZZFPLAFTVE  1391  1240
ZZZZZZZZPLAFTVE  1489  1241
GEQVEFSFPLAFTVE  18867  1242
Empty (control)  580





Example 11 

This example characterizes CD4 receptor sequences found to
have HIV gp120 binding activity in screening tests. Panel
A displays information obtained from sequential replacement
of amino acid residues by alaninyl residues. In Panel A, a
(+) signifies a decrease in binding affinity, whereas a (>)
indicates that replacement of the residue by an alaninyl
residue yields an increase in binding affinity. Sequences
are shown with the amino-terminus at the top and the
carboxyl-terminus at the bottom. Right and left sides are
from independent assays.
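
The marking scheme described above can be sketched, purely for illustration, as a comparison of each variant's counts with the counts of the unsubstituted peptide from the same assay. The function name `classify_substitution` and both cutoff ratios (0.5 and 1.5) are hypothetical; the source does not state what thresholds were used, nor how the number of (+) symbols was graded, so this sketch returns at most one symbol.

```python
def classify_substitution(variant_counts: float, wild_type_counts: float,
                          decrease_below: float = 0.5,
                          increase_above: float = 1.5) -> str:
    """Return "+" for reduced binding, ">" for increased binding, else "".

    The two ratio cutoffs are illustrative assumptions, not values taken
    from the specification.
    """
    if wild_type_counts <= 0:
        raise ValueError("wild-type counts must be positive")
    ratio = variant_counts / wild_type_counts
    if ratio < decrease_below:
        return "+"
    if ratio > increase_above:
        return ">"
    return ""


# Counts from Panel G: unsubstituted GEQVEFSFPLAFTVE and two alanine variants.
wild_type = 20691
print(repr(classify_substitution(6793, wild_type)))   # GEQVEASFPLAFTVE
print(repr(classify_substitution(21681, wild_type)))  # GEQVEFAFPLAFTVE
```

Under these assumed cutoffs, the phenylalanine-to-alanine variant is flagged as reduced and the serine-to-alanine variant is left unmarked.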



Panel A. 



105-113    116-130    131-145    216-229
D          E          T          G
T          E          H          E
++Y++      V          L          Q
+I+        +Q+
++E++


V 




D 


S 


E 



Panel B indicates the effect on binding affinity when
successive amino acid residues are deleted, either from the
amino-terminus (symbols on the right side) or from the
carboxyl-terminus, that is, from the bottom (symbols on the
left side). A (+) signifies a decrease in binding affinity,
and the underlined residues indicate which residue was the
last residue to be serially deleted.
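
A serial deletion series of this kind can be enumerated with the following minimal Python sketch. It assumes, based on the peptide tables, that each deleted residue is written as the placeholder Z so that every entry keeps the length of the parent peptide. The function name `truncation_series` and its parameters are hypothetical.

```python
def truncation_series(peptide: str, max_removed: int,
                      terminus: str = "C", placeholder: str = "Z") -> list[str]:
    """Return the parent peptide followed by its successive truncations.

    `terminus` is "N" to delete from the amino-terminus or "C" to delete
    from the carboxyl-terminus. Deleted positions are filled with
    `placeholder` (assumed notation) so all entries have equal length.
    """
    if terminus not in ("N", "C"):
        raise ValueError("terminus must be 'N' or 'C'")
    if not 0 <= max_removed <= len(peptide):
        raise ValueError("max_removed must be between 0 and the peptide length")
    series = []
    for n in range(max_removed + 1):
        if terminus == "C":
            series.append(peptide[:len(peptide) - n] + placeholder * n)
        else:
            series.append(placeholder * n + peptide[n:])
    return series


for entry in truncation_series("EEVQLLVFGLTANSD", 8, terminus="N"):
    print(entry)
```

Passing `terminus="C"` gives the complementary carboxyl-terminal series for the same peptide.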


Panel B. 



105-113    116-130      131-145    216-229
D+         E            T          G
T          E+           H          E
Y          V+           L+         Q+
I          Q++          L+         V+
C          L+++         Q++        E+++
+++E       L+++         G++        F+++
[missing]  V+++         Q+++       S++++
+E         ++++F++++    +++S+++    ++++F++++
D          ++G          +++L       +++P
           +L           +++T       +++L
           T            +++L       ++A
           A            ++T        ++F
           N            ++L        +T
           S            +E         +V
           D            S          E

All publications cited herein are hereby incorporated
by reference to the same extent as if each publication were
individually and specifically indicated to be incorporated
by reference and were set forth in its entirety herein.

While this invention has been described with an
emphasis upon preferred embodiments, it will be obvious to
those of ordinary skill in the art that variations of the
preferred embodiments can be used and that it is intended
that the invention can be practiced otherwise than as
specifically described herein. Accordingly, this invention
includes all modifications encompassed within the spirit
and scope of the invention as defined by the following
claims.



